Chosen Methods

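This chapter describes the principles behind the methods we chose for the database structure, data collection, data analysis and arrangement, queries and searching, the run mode, and the interfaces.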

Database structure

We decided to break the database into several files, all defined in the configuration files. Each file is an ASCII table representing the relationship between two entities of the LAN. Taken together, the files represent one large table of all nodes, related to each other through those files. This does not distribute the database; it simplifies it, which is only worth doing in our case because we are dealing with ASCII data. Scanning such a narrow file is much faster than scanning one wide table.

Then we selected the main key of our database. That was easy, since every important node (data supplier) in the LAN has a unique identifier - the MAC address of its network adapter. Thus each table consists of pairs: a MAC address and the corresponding data of the LAN node - a name, an IP, a TO etc. The unique MAC address bridges all kinds of data. It also solves the problem of a computer having several network adapters - each of them is unique! Hubs and routers keep MAC addresses in their caches, so we can dump a list of active MAC addresses from them.
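As an illustration of this layout, here is a minimal sketch of reading one narrow MAC-keyed table. It is written in Python rather than the project's Perl, and the sample addresses, the comma-delimited layout of the sample, and the function name load_pairs are assumptions made for the example only.

```python
import csv
import io

# Hypothetical contents of one narrow table (MAC vs. IP), comma delimited.
SAMPLE_MAC_IP = """\
00:A0:24:11:22:33,192.0.2.10
00:a0:24:44:55:66,192.0.2.11

00:a0:24:77:88:99,192.0.2.10
"""

def load_pairs(text):
    """Return a list of (mac, value) pairs from comma-delimited text."""
    pairs = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) != 2:
            continue  # skip blank or malformed lines
        mac, value = row
        # Normalizing case is an assumption; it makes MACs comparable as keys.
        pairs.append((mac.strip().lower(), value.strip()))
    return pairs

pairs = load_pairs(SAMPLE_MAC_IP)
```

A computer with two network adapters simply appears as two rows with two different MAC keys, so the key stays unique per row.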

Data locating and collecting

The LAN data is located as follows within the entities we defined in previous chapters:

Throughout the collection, log files are updated with any errors that occur.

Each data collection procedure is a separate script, named after the data it collects - like mac-ip.pl or name-user.pl.

Data analysis and arranging

First we dump the collected data into temporary files, where it is arranged in the pairs in which it was collected - MAC vs. IP, IP vs. name etc. We then check the data for validity.

IMPORTANT:
A Hub may often see several MAC addresses connected to one slot. Sometimes this is true (when a mini-Hub is used, which unfortunately does not support SNMP), but sometimes it is not. After a thorough analysis and several iterations we arrived at the conclusion that if a router is detected among those several stations, then the router is the only valid connection and the rest are just passers-by. Still, this statement is an engineering approximation and should be treated carefully; sometimes it requires a visit to the communication cabinet for verification.
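The rule above can be sketched as follows, in Python rather than the project's Perl. The function name valid_stations and the idea that the set of router MAC addresses is already known are assumptions for the example; how routers are actually recognized is not shown here.

```python
def valid_stations(slot_macs, router_macs):
    """Choose which MAC addresses to treat as attached to one hub slot.

    slot_macs   -- MAC addresses the hub reports on the slot
    router_macs -- set of MAC addresses known to belong to routers (assumed input)
    """
    routers = [mac for mac in slot_macs if mac in router_macs]
    if routers:
        # A router was seen on the slot: keep only it, drop the others.
        return routers
    # No router: all stations may be real (for example behind a mini-Hub).
    return list(slot_macs)
```

Because the rule is only an approximation, a real tool would probably also log the discarded addresses so they can be checked by hand.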

We discard the extra data - like aliased computer names or multiple IP numbers. We define the constants NONAME and NOUSER for computers with no name and no user-owner.
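A minimal sketch of this clean-up step, in Python rather than the project's Perl. Keeping the first value seen for each key is an assumption (the text does not say which alias or IP survives), as are the string values of the two constants and the function name clean_pairs.

```python
NONAME = "NONAME"  # assumed value of the placeholder for unnamed computers
NOUSER = "NOUSER"  # assumed value of the placeholder for computers with no user-owner

def clean_pairs(pairs, keys, placeholder):
    """Keep one value per key and fill in the placeholder where none exists.

    pairs       -- collected (key, value) pairs, possibly with extras
    keys        -- every key that must appear in the result
    placeholder -- NONAME or NOUSER
    """
    first = {}
    for key, value in pairs:
        if value and key not in first:
            first[key] = value  # later aliases or extra IPs are discarded
    return [(key, first.get(key, placeholder)) for key in keys]

names = clean_pairs(
    [("ip1", "alpha"), ("ip1", "alpha-alias"), ("ip2", "")],
    ["ip1", "ip2", "ip3"],
    NONAME,
)
```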

Then we start rearranging the data into pairs that each contain a MAC address. Since the derivation of the data started with the MAC and then went down tree-like, this is always possible! Then the files are written, and they receive names indicating their contents - like mac-ip.csv or mac-user.csv. The csv extension is an accepted convention for ASCII database files. The data is comma delimited and ready for scanning!

The arrangement took quite a cunning piece of work to make it run quickly, but in the end it was a reliable mechanism. It includes cutting off tree branches where the chain gets lost at some stage (for example, no computer was detected with an IP listed in the nameserver), and the chain may be as long as the configuration files specify.
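The rearrangement described above can be sketched as a join of two collected pair lists on their shared column. This is Python rather than the project's Perl, and the function names rekey_by_mac and to_csv, as well as the sample data, are hypothetical.

```python
import csv
import io

def rekey_by_mac(mac_ip, ip_value):
    """Turn MAC-IP pairs plus IP-value pairs into MAC-value pairs.

    A MAC whose IP has no entry in ip_value is dropped: that branch
    of the tree is cut because the chain is lost at this stage.
    """
    lookup = dict(ip_value)
    return [(mac, lookup[ip]) for mac, ip in mac_ip if ip in lookup]

def to_csv(pairs):
    """Render pairs as comma-delimited text, one pair per line."""
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(pairs)
    return out.getvalue()

mac_name = rekey_by_mac(
    [("m1", "192.0.2.10"), ("m2", "192.0.2.11")],
    [("192.0.2.10", "alpha")],
)
text = to_csv(mac_name)  # would be written to a file such as mac-name.csv
```

Longer chains can be handled by applying the same join repeatedly, once per link.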

General Query ideas

First we defined the query format (not necessarily the command line, but the data passed to the query functions in the project). We decided on the following layout:

query_function type_of_queried_argument type_of_requested_data argument

Thus the function call passes the type of the queried argument and the type of the requested data. The type_of_queried_argument and the type_of_requested_data are similar in nature; in fact you may ask about a thing itself. A multiple or ranged query is just a repetition of a simple query, so we didn't have to do the work twice. We did not implement a search on two elements of different data types, since it was beyond the scope of a prototype.
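To illustrate how a multiple or ranged query reduces to repeated simple queries, here is a small Python sketch (not the project's Perl). The names toy_query, multiple_query and ip_range are hypothetical, and the way a range is written and expanded is an assumption, since the format above does not specify it.

```python
def toy_query(arg_type, data_type, argument):
    """Stand-in for the real simple query; it only echoes its inputs."""
    return f"{data_type} of {arg_type} {argument}"

def multiple_query(simple_query, arg_type, data_type, arguments):
    """Answer a multiple query by repeating the simple query per argument."""
    return [simple_query(arg_type, data_type, arg) for arg in arguments]

def ip_range(prefix, first, last):
    """Expand a range of host numbers into individual addresses (assumed syntax)."""
    return [f"{prefix}.{host}" for host in range(first, last + 1)]

answers = multiple_query(toy_query, "ip", "name", ip_range("192.0.2", 10, 12))
```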

Search within the queries

The search is very simple. The given argument is looked up in the data file that matches its type, by plain string-by-string scanning, and thus its MAC address is located. If the given argument is the MAC itself, the scanned file is that of MAC vs. the first requested data type. The MAC is cached and we scan again, string by string, the file of MAC vs. requested data, one requested type after another, keeping the data found for the output at the end. It is like climbing one post, crossing the crossbar that the MAC serves as, and then going down another post.

The mechanism is a bit slow, but very reliable and good enough for the prototype. Later, when the real system is installed and binary database software is used, specific search queries will be available.

If a specific piece of data was not found, a corresponding message is printed. If no suitable data was found at all, we assume a typo in the argument.
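The two-pass search and its two failure cases can be sketched as follows, in Python rather than the project's Perl. The in-memory TABLES dictionary stands in for the MAC-keyed files, and all names and sample values are assumptions for the example.

```python
# Hypothetical stand-ins for the MAC-keyed files; each line is "MAC,data".
TABLES = {
    "ip": ["00:a0:24:11:22:33,192.0.2.10", "00:a0:24:44:55:66,192.0.2.11"],
    "name": ["00:a0:24:11:22:33,alpha"],
}

def find_mac(lines, value):
    """First pass: scan line by line for the value and return its MAC."""
    for line in lines:
        mac, _, data = line.partition(",")
        if data == value:
            return mac
    return None

def find_data(lines, mac):
    """Second pass: scan line by line for the MAC and collect its data."""
    found = []
    for line in lines:
        key, _, data = line.partition(",")
        if key == mac:
            found.append(data)
    return found

def query(arg_type, data_type, argument):
    """Return a list of results, or None when the argument itself is unknown."""
    if arg_type == "mac":
        mac = argument  # the argument already is the crossbar
    else:
        mac = find_mac(TABLES[arg_type], argument)
    if mac is None:
        return None  # nothing matched at all: possibly a typo in the argument
    if data_type == "mac":
        return [mac]
    return find_data(TABLES[data_type], mac)
```

In this sketch an empty list corresponds to the "data not found" message, while None corresponds to the suspected typo.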

Run each time or daemon?

Here we compare two possible ways of handling data and searches. One way, which we did not use, was to run the search engine as a daemon holding all the data in quickly accessible Perl hashes. True, the search would then run much faster. But such a daemon would load the network administration computer, and avoiding that was a major priority for the prototype; in addition, the database updated on disk and the copy held in memory by the running daemon would sometimes get out of sync. So we decided that the prototype should use the second way, running the search each time - slower, but more reliable and clearer.

Interface implementation concepts

In the CLI no special concepts were applied. The program call is very similar to the function call described above. Special care was taken over the program output, so that it can be parsed by other stream processors such as the shell or awk. The usage message is similar to that of other Unix commands. The usage itself - the command line divided into a command, two subcommands defining the type of the given argument and the type of the requested data, and the argument itself - is recognizable and easy to handle for a Unix user such as the Network Administrator.
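A rough sketch of such a command-line front end, in Python rather than the project's Perl. The tab-separated output layout, the exit status 2 for a usage error, and the names format_usage and run are assumptions; the real program's output format is not described here.

```python
def format_usage(prog):
    """Build a Unix-style usage line for the query command."""
    return f"usage: {prog} type_of_queried_argument type_of_requested_data argument"

def run(argv, query):
    """Return (exit_status, output_lines) for one command line.

    argv  -- the full argument vector, program name first
    query -- a function (arg_type, data_type, argument) -> list of results
    """
    if len(argv) != 4:
        return 2, [format_usage(argv[0] if argv else "query")]
    _, arg_type, data_type, argument = argv
    results = query(arg_type, data_type, argument)
    # One record per line, tab separated, so awk or the shell can split it.
    return 0, [f"{argument}\t{value}" for value in results]

status, lines = run(["query", "ip", "name", "192.0.2.10"], lambda a, d, x: ["alpha"])
```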

In the GUI, HTML concepts were used to present a clear table with a form into which the user enters the data. The output is also presented in a table with headers, so that the user can read it easily. HTML is accepted by many browsers on almost all computer platforms, so this GUI can be considered platform independent.

The Installation GUI is similar to the Query GUI. We deliberately did not propose default values for the parameters, considering it wiser to leave filling them in entirely to the Network Administrators.


romm@empire.tau.ac.il
Last modified: Thu Jun 5 06:49:09 1997