<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
<channel>
<title>Shanghaiist: What?! Another firewall?!</title>
<link>http://shanghaiist.com/2006/09/02/what_another_fi.php</link>
<description>All comments for What?! Another firewall?!</description>
<language>en-us</language>
<copyright>2009 shang_kenneth</copyright>
<lastBuildDate>Wed, 07 Jan 2009 05:00:13 +0800</lastBuildDate>
<docs>http://blogs.law.harvard.edu/tech/rss</docs>
<managingEditor>kenneth@shanghaiist.com</managingEditor>
<webMaster>kenneth@shanghaiist.com</webMaster>
<ttl>60</ttl>
<item>
<title>spirited</title>
<link>http://shanghaiist.com/2006/09/02/what_another_fi.php#comment-369348</link>
<guid isPermaLink="true">http://shanghaiist.com/2006/09/02/what_another_fi.php#comment-369348</guid>
<category>Comments</category>
<pubDate>Mon, 18 Sep 2006 23:12:01 +0800</pubDate>
<description>&lt;p&gt;It is obvious, but boring, you are a fool. Of course it is a long and obtuse article, but invaluable to those of us that like to read our webpages unhindered in China... Just because you are too dense to appreciate it doesn&apos;t mean everyone else is... I have nothing but contempt for you... Go listen to your Paris Hilton CD...&lt;/p&gt;</description>
</item><item>
<title>nanheyangrouchuan</title>
<link>http://shanghaiist.com/2006/09/02/what_another_fi.php#comment-335578</link>
<guid isPermaLink="true">http://shanghaiist.com/2006/09/02/what_another_fi.php#comment-335578</guid>
<category>Comments</category>
<pubDate>Mon, 04 Sep 2006 10:46:25 +0800</pubDate>
<description>&lt;p&gt;@boring:

f off&lt;/p&gt;</description>
</item><item>
<title>degun66</title>
<link>http://shanghaiist.com/2006/09/02/what_another_fi.php#comment-335528</link>
<guid isPermaLink="true">http://shanghaiist.com/2006/09/02/what_another_fi.php#comment-335528</guid>
<category>Comments</category>
<pubDate>Mon, 04 Sep 2006 10:19:04 +0800</pubDate>
<description>&lt;p&gt;It is always like that in China... I am surprised that you have only discovered that now...

Many websites are blocked nearly everywhere, and we do not know why they work in some buildings. For example, a blog that I know has been blocked in China, but a few people in a few companies or a few buildings were still able to access it without a proxy.

And of course there is no need to ask for an explanation from the Chinese government; you will never get one.

88&lt;/p&gt;</description>
</item><item>
<title>Boring</title>
<link>http://shanghaiist.com/2006/09/02/what_another_fi.php#comment-335426</link>
<guid isPermaLink="true">http://shanghaiist.com/2006/09/02/what_another_fi.php#comment-335426</guid>
<category>Comments</category>
<pubDate>Mon, 04 Sep 2006 09:26:45 +0800</pubDate>
<description>&lt;p&gt;Honestly... do a reality check before you post an entry, and ask yourself: is this remotely interesting to the average viewer?&lt;/p&gt;</description>
</item><item>
<title>nanheyangrouchuan</title>
<link>http://shanghaiist.com/2006/09/02/what_another_fi.php#comment-334523</link>
<guid isPermaLink="true">http://shanghaiist.com/2006/09/02/what_another_fi.php#comment-334523</guid>
<category>Comments</category>
<pubDate>Sun, 03 Sep 2006 23:21:27 +0800</pubDate>
<description>&lt;p&gt;Ignoring the Great Firewall of China
Richard Clayton, Steven J. Murdoch, and Robert N. M. Watson
University of Cambridge, Computer Laboratory, William Gates Building,
15 JJ Thomson Avenue, Cambridge CB3 0FD, United Kingdom
{richard.clayton, steven.murdoch, robert.watson}@cl.cam.ac.uk
Abstract. The so-called &quot;Great Firewall of China&quot; operates, in part,
by inspecting TCP packets for keywords that are to be blocked. If the
keyword is present, TCP reset packets (viz: with the RST flag set) are
sent to both endpoints of the connection, which then close. However, because
the original packets are passed through the firewall unscathed, if
the endpoints completely ignore the firewall&apos;s resets, then the connection
will proceed unhindered. Once one connection has been blocked, the firewall
makes further easy-to-evade attempts to block further connections
from the same machine. This latter behaviour can be leveraged into a
denial-of-service attack on third-party machines.
1 Introduction
The People&apos;s Republic of China operates an Internet filtering system which is
widely considered to be one of the most sophisticated in the world [9]. It works,
in part, by inspecting web (HTTP) traffic to determine if specific keywords are
present [8]. These keywords relate to matters such as groups that the Chinese
Government has banned, political ideologies that they consider unacceptable and
historical events that the regime does not wish to have discussed.
It is straightforward to determine that the keyword-based blocking is occurring
within the routers that handle the connections between China and the rest
of the world [14]. These routers use devices based upon intrusion detection system
(IDS) technology to determine whether the content of packets matches the
Chinese Government&apos;s filtering rules. If a connection from a client to a webserver
is to be blocked then the router injects forged TCP resets (with the RST flag
bit set) into the data streams so that the endpoints will abandon the connection.
Once blocking has begun, it will remain in place for many minutes and
further attempts by the same client to fetch material from same website will
immediately be disallowed by the injection of further forged resets.
In Section 2 of this paper we discuss the methods available to countries that
wish to prevent their citizens from accessing particular Internet content and the
strengths and weaknesses that have been identified by previous investigators. In
Section 3 we present the packet traces we obtained from each endpoint of some
connections that were blocked by the Chinese firewall system. In Section 4 we
propose a model for the operation of this firewall to explain the results we have
obtained. Then in Section 5 we show that by ignoring the TCP resets being issued
by the firewall we are able to successfully transfer material that was supposed
to be blocked, and discuss why this may prove difficult for the firewall operators
to address. In Section 6 we show how the blocking action of the firewall can
be leveraged into a denial-of-service attack on third party machines. Finally, in
Section 7, we consider how websites outside of China might make their material
easier to access despite the blocking, and we discuss the merits and demerits of
this method of evading censorship.
2 Content Blocking Systems
Three distinct methods of content blocking - packet dropping, DNS poisoning
and content inspection - have been identified in previous papers by Dornseif [5],
who studied the blocking of right-wing and Nazi material in Nordrhein-Westfalen
and Clayton [3] who studied the hybrid blocking system deployed by BT in the
United Kingdom to block access to paedophile websites.
2.1 Packet Dropping Schemes
In a packet dropping scheme, all traffic to specific IP addresses is discarded and
the content hosted there becomes inaccessible. This scheme is low cost and easy
to deploy - firewalls and routers offer the necessary features as standard.
Packet dropping schemes suffer from two main problems. Firstly, the list
of IP addresses must be kept up-to-date, which could pose some difficulties if
the content provider wishes to make it hard for an ISP to block their websites
(for details of the complexity, see the extensive discussion in [4]). Secondly, the
system can suffer from &quot;overblocking&quot; - all of the other websites that share the
same IP address will also be blocked. Edelman [6] investigated the potential
extent of overblocking and found that 69.8% of the websites for .com, .org and
.net domains shared an IP address with 50 or more other websites. Although
some of these domain names will have merely been &quot;parked&quot;, and providing a
generic webpage, the detailed figures show a continuum of differing numbers of
websites per IP address, reflecting the prevailing commercial practice of hosting
as many websites as possible on every physical machine.
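
As a rough illustration of the overblocking problem (this sketch is not from the paper), the following Python groups a made-up hosting table by IP address and lists the unrelated sites that would be lost if one address were dropped. The hostnames, the addresses and the function name sites_lost_by_blocking are all hypothetical.

```python
from collections import defaultdict

# Hypothetical hosting table: hostname mapped to a shared IP address.
# The addresses come from a range reserved for documentation.
HOSTING = {
    "banned.example": "192.0.2.10",
    "bakery.example": "192.0.2.10",
    "school.example": "192.0.2.10",
    "news.example": "192.0.2.20",
}


def sites_lost_by_blocking(target, hosting):
    """Return the other hostnames that become unreachable when the
    IP address serving target is dropped at a router."""
    by_ip = defaultdict(set)
    for host, ip in hosting.items():
        by_ip[ip].add(host)
    # Every site on the same address is blocked along with the target.
    return sorted(by_ip[hosting[target]] - {target})


print(sites_lost_by_blocking("banned.example", HOSTING))
```

With this toy table, dropping the address of banned.example also takes out the two unrelated sites hosted beside it, which is the collateral effect Edelman measured at scale.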
2.2 DNS Poisoning Schemes
In a DNS poisoning scheme, it is arranged that when the Domain Name System
(DNS) is consulted to translate a textual hostname into a numeric IP address,
no answer is returned; or an incorrect answer is given that leads the user to a
generic site that serves up a warning about accessing forbidden content.
These schemes do not suffer from overblocking in that no other websites will
be affected when access to a specific host is forbidden. However, it can be difficult
to make them work correctly if all that is to be blocked is a website, and email
contact is still to be permitted. Dornseif demonstrated that all of the ISPs in
his sample had made at least one mistake in implementing DNS poisoning.

2.3 Content Inspection Schemes
Most content inspection schemes work by arranging for all traffic to pass through
a proxy which refuses to serve any results for forbidden material. These systems
can be made extremely precise, potentially blocking single web pages or single
images, and permitting everything else to pass through unhindered.
The reason that proxy-based systems are not universally employed is that a
system that can cope with the traffic volumes of a major network - or an entire
country - would be extremely expensive. In Pennsylvania USA, a state statute
requiring the blocking of sites adjudged to contain child pornography was struck
down as unconstitutional in September 2004 [13]. For cost reasons, the Pennsylvanian
ISPs had been using a mixture of packet dropping and DNS poisoning.
The resultant overblocking and &quot;prior restraint&quot; were significant factors in the
court&apos;s decision.
Nevertheless, proxy-based systems have been deployed in countries such as
Saudi Arabia [7], Burma [10] and on specific network providers such as Telenor in
Norway [12]. The UK-based BT system studied by Clayton was a hybrid design,
utilising a low-cost cache, because only the packets destined for relevant IP
addresses would be passed to it. Unfortunately, this permits users to &quot;reverse-engineer&quot;
the list of blocked sites. Since these sites provide illegal images of
children, this runs counter to the public policy aim of the system.
An alternative method of performing content inspection uses components
from an Intrusion Detection System (IDS). The IDS equipment inspects the
traffic as it passes by and determines whether or not the content is acceptable.
When the content is to be blocked it will arrange for packets to be discarded at
a nearby firewall or, in the case of the Chinese system, it will issue TCP reset
packets so as to cause the offending connection to be closed.
An IDS-based system is significantly more flexible than the other schemes,
and it is much less simple to circumvent. Both Dornseif [5] and Clayton [4]
have extensive discussions on how to circumvent the different types of content
blocking they identify. However, the IDS approach ought to be able to detect the
traffic no matter what evasion scheme is tried, provided that the traffic remains
in the clear and is not encrypted or obfuscated in a manner that the IDS cannot
convert to a canonical form before coming to a decision.
3 How the Chinese Firewall Blocks Connections
In our experiments we were accessing a website based in China (within the
Chinese firewall) from several machines based in Cambridge, England (outside
the Chinese firewall). The Chinese firewall system, as currently deployed, is
known to work entirely symmetrically¹ - detecting content to be filtered as it
passes in both directions - and by issuing all the commands from the Cambridge
end we avoided any possibility of infringing Chinese law.
¹ This symmetry is necessarily present because it permits the firewall to block both
requests that are deemed to be unacceptable and the return of unacceptable content.

3.1 Blocking with Resets
Initially we accessed a simple web page, which arrived in an entirely normal
manner, just as would be expected. As can be seen from the packet dump below,
after the initial TCP three-way handshake (SYN, SYN/ACK, ACK) the client
(using port 53382 in this instance) issues an HTTP GET command to the server&apos;s
http port (tcp/80) for the top level page (/), which is then transferred normally.
We were using Netcat (nc) to issue the request, rather than a web browser,
so that we might avoid extraneous detail. The packet traces were captured by
Ethereal, but we present them in a generic format.
cam(53382) → china(http) [SYN]
china(http) → cam(53382) [SYN, ACK]
cam(53382) → china(http) [ACK]
cam(53382) → china(http) GET / HTTP/1.0
china(http) → cam(53382) HTTP/1.1 200 OK (text/html) etc. ...
china(http) → cam(53382) ... more of the web page
cam(53382) → china(http) [ACK]
... and so on until the page was complete
We then issued a request which included a small fragment of text that we
expected to cause the connection to be blocked, and this promptly occurred:
cam(54190) → china(http) [SYN]
china(http) → cam(54190) [SYN, ACK] TTL=39
cam(54190) → china(http) [ACK]
cam(54190) → china(http) GET /?falun HTTP/1.0
china(http) → cam(54190) [RST] TTL=47, seq=1, ack=1
china(http) → cam(54190) [RST] TTL=47, seq=1461, ack=1
china(http) → cam(54190) [RST] TTL=47, seq=4381, ack=1
china(http) → cam(54190) HTTP/1.1 200 OK (text/html) etc. ...
cam(54190) → china(http) [RST] TTL=64, seq=25, ack zeroed
china(http) → cam(54190) ... more of the web page
cam(54190) → china(http) [RST] TTL=64, seq=25, ack zeroed
china(http) → cam(54190) [RST] TTL=47, seq=2921, ack=25
The first three reset packets had sequence values that corresponded to the
sequence number at the start of the GET packet, that value plus 1460 and that
value plus 4380 (3 × 1460).² We believe that the firewall sends three different
values to try and ensure that the reset is accepted by the sender, even if the
sender has already received ACKs for &quot;full-size&quot; (1460 byte) packets from the
destination. Setting the sequence value of the reset packet &quot;correctly&quot; is necessary
because many implementations of TCP/IP now apply strict checks that the
value is within the expected &quot;window&quot;. The vulnerabilities inherent in failing to
check for a valid sequence value were first pointed out by Watson in 2004 [15].
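
The arithmetic behind the three reset sequence values can be sketched in Python as follows. This is only an illustration of the numbers reported in the trace, not the firewall&apos;s actual code; the function name reset_sequence_candidates is hypothetical, and applying the same base, plus one, plus three segment pattern to the 1448-byte case (seen when TCP timestamps were enabled) is an assumption.

```python
MSS_NO_OPTIONS = 1460        # full-size segment payload reported in the text
TIMESTAMP_OPTION_BYTES = 12  # TCP option bytes present when timestamps are on


def reset_sequence_candidates(base_seq, segment_size=MSS_NO_OPTIONS):
    """Sequence values for the three forged resets: the base value,
    base plus one segment, and base plus three segments."""
    return [base_seq + k * segment_size for k in (0, 1, 3)]


# Relative sequence number 1 is the start of the GET packet in the trace.
print(reset_sequence_candidates(1))  # [1, 1461, 4381]

# Assumed pattern for the timestamp case, where segments carry 1448 bytes.
print(reset_sequence_candidates(1, MSS_NO_OPTIONS - TIMESTAMP_OPTION_BYTES))  # [1, 1449, 4345]
```

The first result reproduces the seq=1, seq=1461 and seq=4381 values in the Cambridge-side trace.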
The trace also shows part of the web page arriving from the Chinese machine
after the connection had already been aborted (we examine why this occurred
below). The Cambridge machine therefore sent its own TCP resets in response to
these two (now) unexpected packets. Note that it zeroed the acknowledgement
fields, rather than using a value relative to the randomly chosen initial value.
² When we enabled TCP timestamps, and the packets contained 12 bytes of TCP
options, we observed that these values changed to multiples of 1448.
All of the reset packets arrived with a time-to-live (TTL) field value of 47,
whereas the packets from the Chinese webserver always had a TTL value of 39,
indicating that they were from a different source. If both sources set an initial
value of 64, then this would indicate the resets were generated 8 hops away from
the webserver, which traceroute indicates is the second router within the China
Netcom Corporation network (AS9929) after the traffic is passed across from the
Sprint network (AS1239).
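
The hop estimate above is simple subtraction, and a short Python sketch makes the steps explicit. The initial TTL of 64 is the assumption stated in the text (a real sender may use another value), and the function name hops_travelled is hypothetical.

```python
ASSUMED_INITIAL_TTL = 64  # assumption from the text; real senders may differ


def hops_travelled(observed_ttl, initial_ttl=ASSUMED_INITIAL_TTL):
    """Number of routers a packet crossed, given the TTL it arrived with."""
    return initial_ttl - observed_ttl


server_hops = hops_travelled(39)  # packets from the Chinese webserver
reset_hops = hops_travelled(47)   # forged resets seen in Cambridge

# The resets started this many hops closer to Cambridge than the webserver.
print(server_hops - reset_hops)  # 8
```

If either sender used a different initial TTL, the 8-hop figure would shift accordingly, which is why the text treats the result as conditional.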
We also examined this blocked connection from the point of view of the
Chinese webserver:
cam(54190) → china(http) [SYN] TTL=42
china(http) → cam(54190) [SYN, ACK]
cam(54190) → china(http) [ACK] TTL=42
cam(54190) → china(http) GET /?falun HTTP/1.0
china(http) → cam(54190) HTTP/1.1 200 OK (text/html) etc. ...
china(http) → cam(54190) ... more of the web page
cam(54190) → china(http) [RST] TTL=61, seq=25, ack=1
cam(54190) → china(http) [RST] TTL=61, seq=1485, ack=1
cam(54190) → china(http) [RST] TTL=61, seq=4405, ack=1
cam(54190) → china(http) [RST] TTL=61, seq=25, ack=1
cam(54190) → china(http) [RST] TTL=61, seq=25, ack=2921
cam(54190) → china(http) [RST] TTL=42, seq=25, ack zeroed
cam(54190) → china(http) [RST] TTL=42, seq=25, ack zeroed
As can be seen, when the &quot;bad&quot; packet was detected, the firewall also sent
resets to the Chinese machine, but these resets arrived after the GET packet
(and after the response had commenced). The last two resets (with zeroed ack
values), were the ones that were sent by the Cambridge machine.
The other resets (generated because falun was present) arrived at the Chinese
webserver with a TTL value of 61, which is consistent with them being
generated 3 hops away with an initial count of 64. This differs from the 8-hop
offset we observed from Cambridge. However, it is possible that there is more
than one device that is generating resets - or the initial count may have been
adjusted to be different from 64. We do not currently have any definitive explanation
for the lack of symmetry that this observation represents.
The first three blocking resets were also set to a range (+25, +1485, +4405)
of sequence numbers in an attempt to ensure that at least one was accepted,
and in fact the +25 packet will have reset the connection.³ The fourth and
fifth resets received can be seen, by examining their acknowledgement values,
to be responses to the two packets that the server managed to send before the
connection was reset.
³ If the resets had arrived before the GET packet, then the resets would not have been
accepted. The server is running FreeBSD and in this stage of a connection its TCP
stack will, to provide protection against denial-of-service attacks, only accept a reset
where the sequence number exactly matches the last acknowledgement sent. Before
the GET arrives that value is +1, and hence all of the resets would be ineffective.
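
The exact-match rule in this footnote can be illustrated with a small Python sketch. It is a simplified model of the check as the text describes it, not FreeBSD&apos;s actual implementation; the function name accepts_reset is hypothetical, and the 24-byte request (with CRLF CRLF termination) is an assumption chosen because it agrees with the seq=25 value in the trace.

```python
# Assumed wire form of the request; its 24 bytes match seq=25 in the trace.
REQUEST = b"GET /?falun HTTP/1.0\r\n\r\n"

# Relative sequence values of the three blocking resets seen at the server.
RESET_OFFSETS = [25, 1485, 4405]


def accepts_reset(reset_seq, last_ack_sent):
    """Strict rule described in the footnote: only an exact match counts."""
    return reset_seq == last_ack_sent


ack_before_get = 1                             # nothing received yet
ack_after_get = ack_before_get + len(REQUEST)  # 25 once the GET has arrived

# Before the GET arrives, none of the three resets would be accepted.
print([accepts_reset(s, ack_before_get) for s in RESET_OFFSETS])

# After the GET arrives, only the +25 reset matches and tears down the connection.
print([accepts_reset(s, ack_after_get) for s in RESET_OFFSETS])
```

Under these assumptions the first list is all False and the second is True only for the +25 reset, which matches the behaviour described for the server.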

3.2 Immediate Reset of Connections
The firewall is not just inspecting content but has other blocking rules as well.
Having made a &quot;bad&quot; connection we found that, for a short period, all web
traffic between the same two hosts was blocked, before any determination could
possibly have been made as to the content. This can also be seen in the previous
example - but it applies to new connections as well. For example, immediately
after the example documented above we saw this:
cam(54191) → china(http) [SYN]
china(http) → cam(54191) [SYN, ACK] TTL=41
cam(54191) → china(http) [ACK]
china(http) → cam(54191) [RST] TTL=49, seq=1
Here the reset packet came from the firewall (which sent a reset to the webserver
as well). If the client manages to send out its GET packet before the
reset arrives from the firewall then multiple resets arrive from the firewall (even
if the GET is entirely innocuous). These are then followed by resets from the
webserver - which usually receives the resets promptly and so it will have torn
down the connection before the GET arrives.
It should be noted that the firewall does not attempt to reset the connection
at the SYN stage but waits for the SYN/ACK. Although the client could immediately
be sent valid reset packets when the SYN is seen, it is only when the
SYN/ACK packet is observed that a reset can be constructed with valid values
for the server to act upon.
In our experiments, we found that the length of time for which a pair of endpoints
would be prevented from communicating was somewhat variable. Sometimes
the blocking would only last for a few minutes, yet at another time the
block would be present for most of an hour. The average value was around 20
minutes, but because we saw significant clustering of times around specific values
we suspect that different firewall system components may be setting different
time delays; and hence a better understanding of which component was to handle
our trac would enable us to predict the blocking period fairly accurately.
3.3 Application to other Chinese Networks
We obtained a list of Chinese Autonomous Systems (ASs)⁴ and from that generated
a list of all Chinese subnets that were present in the global routing table.
We then used a modified tcptraceroute to determine which ASs were handling
traffic as it crossed from international networks into China, and from this
learnt the identities of the major Chinese border networks. These turned out
to be: AS4134, AS4837, AS7497, AS9800, AS9808, AS9929, AS17622, AS24301
and AS24489. We then selected an example web server within each of these ASs
and found that similar RST behaviour occurred on all of these networks except
AS24489 (Trans-Eurasia Information Network). From this we conclude that our
results are extremely typical of the &quot;Great Firewall of China&quot;, as it exists in late
May 2006, but are not necessarily universally applicable.
⁴ http://bgpview.6test.edu.cn/bgp-view/cur_ana/ipv4cn/china_asnlist.shtml

4 Design of the Chinese Firewall
Based on the results of our experiments, and descriptions of the type of devices
and technologies known to be employed by the Chinese - such as Cisco&apos;s &quot;Secure
Intrusion Detection System&quot; [2] - we propose the following outline model for the
operation of a router that is a part of the Chinese firewall. This model fits
our observations well, but it remains speculative because the Chinese network
providers do not publish any specifications of their systems.
When a packet arrives at the router it is placed into an appropriate queue for
onward transmission. The packets are also passed to an out-of-band IDS device
within which their content is inspected. If the packet is considered to be &quot;bad&quot;
by the IDS device (because of a keyword match) then three TCP reset packets
- with the three different sequence numbers - are generated for each endpoint
and given to the router to be tr&lt;/p&gt;</description>
</item><item>
<title>Pirx</title>
<link>http://shanghaiist.com/2006/09/02/what_another_fi.php#comment-333676</link>
<guid isPermaLink="true">http://shanghaiist.com/2006/09/02/what_another_fi.php#comment-333676</guid>
<category>Comments</category>
<pubDate>Sun, 03 Sep 2006 12:42:24 +0800</pubDate>
<description>&lt;p&gt;You are right. Something has changed, they have fiddled with the routers again. 

Clearly the objective is to present the appearance of an openly accessible Internet to the occasional visitor to China. He can easily log on to any site he wishes from his 5-star hotel room, while the average John Doe cannot reach any content deemed unsuitable by the censors from the computer in his home.&lt;/p&gt;</description>
</item>
</channel>
</rss>