Understanding Internet addresses (URLs)
Each document and resource on the Internet is identified by a unique address
that distinguishes it from other documents and resources. This address is also
called a Uniform Resource Locator (URL). This section discusses only
addresses of web resources (the http and https protocols), and only to the
extent needed to understand ad and cookie blocking.
You can see the address of the document you are currently browsing in the Address box
of the Internet Explorer Address Bar. When you move the mouse pointer over a link on
a page, the address of the document the link points to is shown in the
Internet Explorer status bar; however, some sites do not allow this. All
pictures (banners), popup windows and other page elements (resources) have their
own addresses. Banner, popup and cookie blocking is based on these addresses, so
a basic understanding of web addresses is necessary.
The simplified web address structure is:
protocol://host/folder1/folder2/resource1?param1=value1&param2=value2
For example, this address:
http://somedomain.com/images/banners/ads.dll?ad=1222&lang=10
has the following parts:
- http:// - protocol;
- somedomain.com - host;
- images, banners - folders;
- ads.dll - resource;
- ad, 1222 - the first parameter and its value;
- lang, 10 - the second parameter and its value.
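The same breakdown can be reproduced programmatically. The following is a minimal sketch in Python using the standard urllib.parse module. The function name split_address and the dictionary keys it returns are illustrative choices made for this example and are not part of Ad Annihilator.

```python
from urllib.parse import urlsplit, parse_qsl

def split_address(address):
    """Split a web address into protocol, host, folders, resource and parameters.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(address)
    # "/images/banners/ads.dll" -> ["images", "banners", "ads.dll"]
    segments = parts.path.split("/")[1:] if parts.path else []
    # Assumption: the last path segment is the resource, the rest are folders.
    folders = segments[:-1]
    resource = segments[-1] if segments else ""
    return {
        "protocol": parts.scheme,
        "host": parts.hostname or "",
        "folders": folders,
        "resource": resource,
        # List of (name, value) pairs in the order they appear in the address.
        "parameters": parse_qsl(parts.query, keep_blank_values=True),
    }

info = split_address("http://somedomain.com/images/banners/ads.dll?ad=1222&lang=10")
for name, value in info.items():
    print(name, "=", value)
```

Note that an address alone cannot tell whether a final segment without a trailing slash names a folder or a resource; this sketch always treats it as the resource.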
These address parts are described below:
- protocol (also known as scheme) - the protocol (set of rules) used to
  retrieve a resource from the Internet. For web addresses, two protocols are
  widely used: http and https. The http protocol transfers documents over an
  unencrypted connection, while https transfers them over an encrypted (secure)
  connection;
- host - the name of the computer (host), or its IP
  address, from which a resource can be retrieved. Host names are registered by
  site owners (and advertisers, too) with a special organization. The name
  usually identifies the site owner. An advertiser can have multiple host
  names registered, but their number is usually limited. This allows
  Ad Annihilator to identify resources that come from advertisers.
  The host part can itself consist of several parts separated by periods, e.g.
  http://ads.images.somedomain.com. The rightmost part is called the top-level
  (or first-level) domain, the part immediately to its left is the second-level
  domain, and so on. Domains to the left of a given domain are also called its
  subdomains.
  The number of top-level domains is restricted; examples are com,
  net, org, edu and others. Site owners usually register their
  domains at the second level, e.g. http://advert.com. Some site owners are
  not primarily advertisers but provide advertising services as
  a secondary activity. They often create third-level domains under their
  site name for serving ad content, for example http://ads.somedomain.com, and
  sometimes fourth-level and deeper domains. Another way such site owners
  serve advertisements is by creating special
  folders for ad serving;
- folder - the hierarchical path to the
  resource on the host computer, similar to a file system path, which identifies
  the location of the resource on the web server. It may contain several parts
  separated by slashes (/). The folder part of the address may be empty.
  As mentioned above for the host part, some advertisers
  use folders to designate the location from which their ad content comes, for
  example, http://somedomain.com/banner;
- resource - the resource name, which is unique within the context
  specified by the folder path and designates a web document or resource. Often
  this part designates not a static document or resource but a dynamic one
  generated by a computer program on the web server. These programs are called CGI
  applications. The resource part may be empty, too.
  Sometimes advertisers use the resource part to serve advertisements,
  e.g. http://somedomain.com/images/ads.aspx;
- parameters and their values - CGI applications often accept
  parameters that modify the resource to be retrieved. Each parameter
  consists of a name, which distinguishes it from other parameters, and a value,
  which carries the parameter's information.
  If a CGI application is used to serve ad content, parameters are often used to
  specify which advertisement the application should return.
  Usually there is no need to pay attention to the parameters of ad-serving CGI
  applications, as users usually want to filter all ads.
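The domain levels described for the host part can be illustrated with a short sketch. It is written in Python for illustration only: the helper names domain_levels and is_same_or_subdomain are hypothetical, and the comparison shown is a plain label-by-label check, not Ad Annihilator's actual matching logic.

```python
def domain_levels(host):
    """Return the host's parts ordered from the top-level domain leftwards."""
    return list(reversed(host.lower().strip(".").split(".")))

def is_same_or_subdomain(host, domain):
    """True if host equals domain or is one of its subdomains."""
    host_levels = domain_levels(host)
    base_levels = domain_levels(domain)
    # A subdomain shares every level of the base domain, starting at the top level.
    return host_levels[:len(base_levels)] == base_levels

levels = domain_levels("ads.images.somedomain.com")
# levels[0] is the top-level domain, levels[1] the second-level part, and so on.
for number, part in enumerate(levels, start=1):
    print("level", number, "=", part)

print(is_same_or_subdomain("ads.somedomain.com", "somedomain.com"))
print(is_same_or_subdomain("notsomedomain.com", "somedomain.com"))
```

Comparing whole parts rather than raw text avoids treating notsomedomain.com as if it belonged to somedomain.com.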
Understanding Internet addresses is required for understanding the banner,
popup and cookie filters, and for selecting banner filter masks in the
Resource block, Resource unblock, Popup block and Popup unblock windows.