Posts Tagged ‘internet’

The Basics of the Domain Name System

So what is the Domain Name System, more commonly known as DNS? Very simply, it is a huge distributed database containing the domain names and addresses of devices on the network. This distributed structure is crucial to the way DNS functions: it allows local control of specific segments, which means that no single party controls the whole database (or, effectively, the internet!). Yet the data in each local segment is available to anyone across the entire database. It's not the most efficient system, but reasonable performance is achieved by replicating the local segments and caching responses for quicker resolution.

The database operates in a client/server configuration, with the important work done by name servers. These name servers hold the address information for specific segments and make it available to clients and other servers, a process called resolving. The clients, known as resolvers, are often just simple library routines: they create queries on demand and forward them to name servers across the network.

The structure of the DNS database is very similar to the directory tree of a filesystem. It's easiest to picture it as an inverted tree with the root node at the very top. Each subsequent node carries a text label which identifies its relationship with its parent. The second tier of the database contains the top-level domains such as com, edu, gov and mil. The node at the top of the tree has a null label, although in practice and in configuration files it is written as a single dot: '.'.
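To make the inverted-tree idea concrete, a domain name can be read as a path of labels walking down from the root. A minimal sketch in Python (the domain used is just an illustration):

```python
# A fully qualified domain name, with the trailing dot standing in
# for the root node's null label.
domain = "www.example.com."

# Split off the labels and reverse them so they read root-first,
# matching a walk down the DNS tree: com -> example -> www.
labels = domain.rstrip(".").split(".")
path_from_root = list(reversed(labels))

print(path_from_root)   # ['com', 'example', 'www']
```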

Every domain in this tree has a unique name, and that name identifies its location in the database: it is made up of the series of node labels leading to it. For example, if I establish a new domain called mybesttelly.co.uk, I would be responsible for that small segment and for any devices beneath it, such as computer1.mybesttelly.co.uk, www.mybesttelly.co.uk and mail.mybesttelly.co.uk. I can control nodes below my domain name but nothing above it, although I can delegate authority for subdomains or redirect my resolution requirements to other servers.

The underlying requirement, of course, is to resolve domain names to specific IP addresses (and vice versa). In the early days of networking this resolution was done by a text file called the hosts file, which was replicated across the network. On larger networks, and certainly on the internet, this is clearly not feasible: the file would be enormous, and replicating every change around the world would be impossible. There is much more to this technology, of course; DNS lies at the heart of what makes the internet function. Name resolution lets any client find the server it is looking for. It is also open to abuse, and there have been many attacks on the infrastructure, as well as localised DNS poisoning and spoofing attacks. DNS is being improved and developed all the time - you can see some of the advances in things like Dynamic DNS or the Smart DNS applied to devices to make them region free - see here for information.
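To see why the old approach was so simple (and why it couldn't scale), here is a hedged sketch of a parser for the traditional hosts-file format; the entries in the sample are invented for the example:

```python
def parse_hosts(text):
    """Parse hosts-file text into a name -> IP mapping.

    Each non-comment line has the form: <IP> <name> [<alias> ...]
    """
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blanks
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            table[name.lower()] = ip           # names are case-insensitive
    return table

sample = """
127.0.0.1      localhost
# an invented example entry
93.184.216.34  www.example.com example.com
"""

print(parse_hosts(sample)["www.example.com"])   # 93.184.216.34
```

Every host on the network needed a fresh copy of this one file whenever any entry changed, which is exactly the replication problem DNS was designed to remove.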

Without DNS we'd be left with huge lists of IP addresses for our favourite sites, which would be rather hard work. As it is, controlling our client IP addresses is more likely to be an issue, as I discovered when I tried to watch Canadian TV in the USA last week but was blocked because I had an American IP address.

For a more thorough introduction you could do a lot worse than the excellent primer on this site - http://www.tcpipguide.com/free/t_TCPIPDomainNameSystemDNS.htm.

Route Tracking using HTTP Tracing

The HTTP/1.1 protocol has its own tracing method, called TRACE, which provides route tracing on request. Of course many people simply use the traceroute commands available on most operating systems; however, these only track hops at the network router level. The HTTP TRACE method provides much more detailed tracking and can even pick up intermediate servers between the various router hops. This can be very useful in a wide variety of situations:

  • Determine the exact proxy route an HTTP request is taking.
  • Identify all servers in the chain, including their software versions, HTTP versions and operating systems.
  • Detect infinite loops.
  • Track down any invalid responses on the route.
  • Identify any router, hub or proxy server which is causing a routing issue.

In some senses the TRACE method is very much like the more familiar GET method, but here the target server is simply supplied as a parameter. You can set the maximum number of hops to follow using the Max-Forwards header. This setting is essential if you need to detect and track the causes of infinite loops without simply getting stuck in them - without it, your troubleshooting requests would be stuck in the loop too, especially when you are dealing with proxy chains. There are other benefits to this method: imagine trying to troubleshoot issues in an international proxy chain spread across Europe, where you suspect the problem lies between the French and German proxies, as in this example - http://thenewproxies.com/german-proxy/.
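The Max-Forwards mechanism itself is easy to sketch: each proxy in the chain decrements the header, and when it reaches zero that proxy must answer the TRACE itself rather than forward it. A simplified illustration of that decision, not a full proxy implementation:

```python
def handle_trace(headers):
    """Decide what a proxy does with a TRACE request, following the
    Max-Forwards rules described for HTTP/1.1.

    Returns a (action, headers) pair: either forward the (possibly
    decremented) request, or respond from this hop.
    """
    mf = headers.get("Max-Forwards")
    if mf is None:
        return "forward", headers                 # no limit set: pass it on
    if int(mf) == 0:
        return "respond", headers                 # this hop is the endpoint
    # Decrement before forwarding so the chain eventually terminates,
    # even inside a routing loop.
    out = dict(headers, **{"Max-Forwards": str(int(mf) - 1)})
    return "forward", out

action, out = handle_trace({"Max-Forwards": "2"})
print(action, out["Max-Forwards"])   # forward 1
```

Because the counter strictly decreases at every hop, a looping chain can only bounce the request a bounded number of times before some proxy responds, which is what makes loop diagnosis possible.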

Using the TRACE method you can fine-tune your troubleshooting even further with telnet. An HTTP TRACE request can be sent over telnet: just connect manually and issue the request at the command line. You will then receive a response directly from that specific server in the chain, carrying an HTTP message that indicates what happened to the request when it reached the final server. You can refine things further with the Via header, which indicates the exact route taken, or can be used to specify a route if you suspect a particular server or router is the problem.
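The request you would type into that telnet session is just plain text. A sketch of the exact bytes, assuming a hypothetical target of example.com and a five-hop limit:

```python
def build_trace_request(host, max_forwards=5):
    """Build a raw HTTP/1.1 TRACE request, suitable for pasting
    into a telnet session on port 80 of the first proxy."""
    return (
        "TRACE / HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        f"Max-Forwards: {max_forwards}\r\n"
        "\r\n"                      # blank line ends the header block
    )

print(build_trace_request("example.com"), end="")
```

A server that honours TRACE echoes the request it received back in the response body, so comparing what you typed with what came back shows you exactly what each intermediary changed along the way.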

Source: http://www.youtube.com/watch?v=R-6JjuQGHJw

 

Proxies and Capturing Authentication Credentials

As the administrator of any internet-facing network knows, allowing your clients access to the internet carries certain risks. Most networks employ a proxy server to help mitigate them, but simply installing a proxy and leaving it be can cause more problems than it solves. For example, one essential step is to filter out the users' authentication credentials before any packets are forwarded to their origin server. The issue should be obvious: at any point a malicious server could capture those credentials and gain access to your network.
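In practice, that filtering amounts to stripping the Proxy-Authorization header (and the other hop-by-hop headers) before the request leaves the proxy, while leaving end-to-end headers like Authorization intact. A minimal sketch, with invented header values:

```python
# Headers addressed to the proxy itself (hop-by-hop) that must never
# be forwarded to the origin server.
HOP_BY_HOP = {
    "proxy-authorization", "proxy-authenticate", "proxy-connection",
    "connection", "keep-alive", "te", "trailer", "transfer-encoding",
    "upgrade",
}

def scrub_for_origin(headers):
    """Return a copy of the request headers that is safe to forward
    upstream: hop-by-hop headers, including proxy credentials, are dropped."""
    return {k: v for k, v in headers.items() if k.lower() not in HOP_BY_HOP}

incoming = {
    "Host": "example.com",
    "Proxy-Authorization": "Basic YWxpY2U6c2VjcmV0",   # invented proxy credentials
    "Authorization": "Bearer end-to-end-token",        # for the origin: kept
}
print(scrub_for_origin(incoming))
```

Note the asymmetry: Proxy-Authorization was meant only for this proxy and is removed, but Authorization is part of the end-to-end exchange and must pass through, which is exactly the limitation discussed below for untrusted proxies.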

This is the same reason that users should never be allowed access to untrusted proxy servers. Any proxy server has the same capability: to intercept usernames and passwords. The real issue with untrusted servers is that although you can filter out some authentication credentials, for example those needed to pass a firewall or external router, any credentials needed for the destination server must be forwarded. So if a user is accessing internet banking, their credentials obviously need to be forwarded to the banking site or they wouldn't be able to log on. A malicious proxy server could intercept these and store them for exploitation or sale at a future date.

Generally there is little you can do about this, other than advise your users not to use untrusted proxy servers. There are legitimate reasons for using a proxy, though, especially for accessing content or sites blocked in your physical location. Many people across the world use proxies to bypass barriers or censorship - just look at this site for one example - http://www.iplayerabroad.com/

For instance, a user may need a US proxy server in order to access their US banking site or another geo-restricted site. If so, they should use a paid resource from a reputable company - for instance, read this post on US proxy servers. This at least means the users are not putting their own data at risk, nor the overall network they are using.

All proxies used either externally or internally should be protected by SSL/TLS and be able to handle forwarding this sort of information. Encryption doesn't completely protect your data, but it does make man-in-the-middle attacks much more difficult (though not impossible). In many instances there is no ideal choice, and often you'll need to transmit usernames and account details over unsecured links. The reality is that security is very much an afterthought in the distributed model of the world wide web; in many situations insecure communication is sadly the standard, and all you can do is minimise the risks at your end.

 

How Does ICAP Work?

In brief, the protocol functions as follows. An HTTP message is passed by an ICAP client to the ICAP server. The server processes the message and sends a reply back to the client. An ICAP client can be either a web proxy server or a web client, and an ICAP server can support whatever services its clients explicitly request.

As an instance of the protocol's use, imagine the following situation. An ICAP server implements two services: an access control service and an antivirus service. Hosts inside a network have access to the internet via a web proxy server.

In this situation, the access control service supplied by the ICAP server checks whether a web client may connect to the website it has requested. More specifically, the web client sends an HTTP request to the proxy server, and the access control service on the ICAP server checks whether the client is allowed to see the site. The ICAP server then either allows the proxy server to continue with the request or responds with an informative HTTP message, which the proxy server redirects to the web client.
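The decision at the heart of that access control service can be sketched as a simple lookup against a blocklist; a real ICAP service would wrap this exchange in ICAP messages, and the host names here are invented:

```python
from urllib.parse import urlparse

BLOCKED_HOSTS = {"blocked.example.net"}   # invented blocklist

def access_check(url):
    """Return (allowed, status) for a requested URL, mimicking the
    decision an ICAP access-control service makes for the proxy."""
    host = (urlparse(url).hostname or "").lower()
    if host in BLOCKED_HOSTS:
        # The informative response is relayed back to the web client.
        return False, "403 Forbidden"
    # The proxy is allowed to continue with the original request.
    return True, "allow"

print(access_check("http://blocked.example.net/page"))   # (False, '403 Forbidden')
```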

The antivirus service, on the other hand, checks whether data passing through the proxy server is infected with a virus. The ICAP server scans the incoming data and, if a virus is detected, responds with a web page telling the user about the problem. To improve the checks it's best to send a test virus from a variety of sources. So for example you could buy a US IP address and deliver the test virus from an American server, in order to exercise the perimeter. Many IT security professionals routinely buy proxy services and VPNs in order to test the integrity of both their internal and external security.

The ICAP protocol is easily extended so that it can handle other kinds of data, not just HTTP requests and responses. For instance, it could be expanded to manage email messages: the format of an email message is very like that of an HTTP reply. In general, almost any object or piece of data can be wrapped as an HTTP object. A simple file, for instance, can be enclosed in an object that contains the actual content of the file along with descriptive metadata - Content-Length, Content-Type, Content-Language and Date - in the form of HTTP headers.
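That wrapping idea is simple to illustrate: prepend HTTP-style headers describing the content to the raw bytes. A hedged sketch (the status line and header choice are illustrative, not a full ICAP encapsulation):

```python
from email.utils import formatdate

def wrap_as_http_object(body: bytes, content_type="text/plain"):
    """Wrap raw file content in HTTP-style headers, the way ICAP
    treats an arbitrary piece of data as an HTTP object."""
    headers = (
        "HTTP/1.1 200 OK\r\n"
        f"Date: {formatdate(usegmt=True)}\r\n"      # e.g. a Day/Date header
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"                                      # blank line, then the body
    )
    return headers.encode("ascii") + body

obj = wrap_as_http_object(b"hello")
```

Once any payload is framed this way, the same ICAP services (scanning, filtering, modification) can operate on it without caring whether it began life as a web page, an email or a plain file.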

XML Sitemaps - A Brief Introduction

Google was the first search engine to introduce sitemaps, with the Google XML sitemap format in 2005. As the internet evolved, the standard evolved with it across all the search engines, and it soon became the standard way for the search giants to crawl a website and subsequently index it. Essentially an XML sitemap is merely an XML file containing a listing of a site's URLs and some information about those URLs. A site can have multiple sitemaps stored in multiple directories, so it's useful if you can test from a variety of sources, perhaps through VPNs and proxies like this, if you have the facility available. To help search engines discover the various sitemaps an organisation may have, the locations of the XML files are listed at the end of the site's robots.txt file. Make sure that the search engine can access all these pages over a standard TCP/IP connection.

An XML sitemap is ideal for sites where some pages are updated more often, or where some pages are more significant than others. For instance, a local company might update its opening hours or product lists quite often, while rarely updating the page describing the company's history. In that case, the webmaster would want search engines to put a higher priority on the hours page during ordinary crawling. Likewise, the webmaster can place increased importance on the hours page, or on other pages with particular content, so the search engine ranks those pages higher when indexing the site.

Sitemaps should record the date a page was last altered, how often the page changes and the page's priority. The last-modified date is simply the calendar date on which the page last changed. The priority is a value from zero to one, with a default of 0.5. Writing out this information for each page isn't difficult, but it can be tedious, so using an XML sitemap generator can reduce the amount of work a webmaster has to do. Several websites provide an online sitemap generator, while others offer offline tools.
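Generating the XML by hand is straightforward with the standard library; a sketch using the opening-hours example above, with invented URLs and values:

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemap protocol.
NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(entries):
    """entries: list of (loc, lastmod, changefreq, priority) tuples.
    Returns the sitemap as an XML string."""
    urlset = ET.Element("urlset", xmlns=NS)
    for loc, lastmod, changefreq, priority in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
        ET.SubElement(url, "changefreq").text = changefreq
        ET.SubElement(url, "priority").text = str(priority)
    return ET.tostring(urlset, encoding="unicode")

sitemap = build_sitemap([
    # The frequently changing hours page gets a higher priority...
    ("http://www.example.com/hours",   "2015-06-01", "weekly", 0.8),
    # ...while the rarely updated history page gets a lower one.
    ("http://www.example.com/history", "2010-01-15", "yearly", 0.3),
])
print(sitemap)
```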

If your site has many thousands of web pages, you should use a professional sitemap creator instead. It can save you a huge amount of time and in some cases your sanity! They're not expensive either - I used one in Paris for many years which was completely free, although it was restricted to French IP addresses only, so I had to use a proxy in France like this.

Although a sitemap is frequently overlooked, it is an essential resource which helps search engines understand sites. Sitemaps can be basic or complicated, depending on the site's size and demands.