Data communication and computer networks


Topic 3: methods of data communication and the principles of computer networks
  • ISO/OSI seven-layer model
  • Protocol and network standards
  • Technical requirements for networks
  • Telecommunication standards
  • Electronic Data Interchange/Health Level 7
  • Internet

More precisely: what to read

Situation as of January 1, 2013.

  • van Bemmel, Chapter 5.
  • Shortliffe, Chapter 5:
    • pages: 200 - 207
      Start 200 with "Local Data Communication"
      End 207 at "Software"
    • pages: 217 - 222
      Start 217 with "Software for Network Communication"
      End 222 at "Data Acquisition and  Signal Processing"

van Bemmel, Chapter 5

asynchronous application: does not wait for confirmation from the server; this type of communication is generally used for e-mail
synchronous application: requires the server to confirm receipt of the message, either by sending an explicit confirmation message or by sending a preliminary reply that the message has arrived; this connection mode is used by database applications
real-time application: requires the server to respond within a certain time frame, so that the information displayed to the client is almost instantaneous; an example is client computers that register vital signs
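
A minimal Python sketch (not from van Bemmel; the Server class and the function names are invented for illustration) contrasting the three connection modes:

  import queue
  import time

  class Server:
      def receive(self, message):
          # Processes the message and returns an explicit confirmation.
          return f"ACK: received '{message}'"

  outbox = queue.Queue()

  def send_asynchronously(message):
      # Asynchronous (e.g., e-mail): queue the message and continue without
      # waiting for the server to confirm receipt.
      outbox.put(message)

  def send_synchronously(server, message):
      # Synchronous (e.g., database applications): block until the server
      # confirms that the message has arrived.
      return server.receive(message)

  def send_real_time(server, message, deadline_seconds=0.1):
      # Real time (e.g., vital-signs registration): the reply must arrive
      # within a fixed time frame, otherwise it is treated as a failure.
      start = time.monotonic()
      reply = server.receive(message)
      if time.monotonic() - start > deadline_seconds:
          raise TimeoutError("server did not respond within the required time frame")
      return reply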

Shortliffe, Chapter 5, Local Data Communication

Communication can occur via telephone lines, dedicated or shared wires, fiber-optic cables, infrared, or radio waves. 
modem (modulator-demodulator) converts the digital data from a computer to analog signals in the voice range
The overall bit rate of a communication link is a combination of the rate at which signals (or symbols) can be transmitted and the efficiency with which digital information (in the form of bits) is encoded in the symbols. 

a 56,000 bits per second (bps) modem may use a signal rate of only 8,000 baud and an encoding that transmits up to 8 bits per signal. For graphics information, speeds of more than 500,000 bps are desirable.
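
A quick back-of-the-envelope check in Python of the relationship above; the figure of 7 effective bits per signal is an assumption chosen so the arithmetic matches the nominal 56 kbps:

  signal_rate_baud = 8_000        # signals (symbols) transmitted per second
  bits_per_signal = 7             # bits effectively carried per signal (assumed)
  bit_rate_bps = signal_rate_baud * bits_per_signal
  print(bit_rate_bps)             # 56000 bps, the nominal speed of a 56k modem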
  
Digital Subscriber Line (DSL), Integrated Services Digital Network (ISDN) 

Frame Relay is a network protocol designed for sending digital information over shared, wide-area networks (WANs)

Asynchronous Transfer Mode (ATM) is a protocol designed for sending streams of small, fixed-length cells of information (each 53 bytes long) over very high-speed dedicated connections—most often digital optical circuits. 

local area network (LAN) allows local data communication without involving the telephone company or network access provider.

file servers: computers dedicated to storing local files, both shared and private

There are a variety of protocols and technologies for implementing LANs.
Typically data are transmitted as messages or packets of data; each packet contains the data to be sent, the network addresses of the sending and receiving nodes, and other control information. LANs are limited to operating within a geographical area of at most a few miles and often are restricted to a specific building or a single department. Separate remote LANs may be connected by bridges, routers, or switches
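
A minimal sketch of the packet structure just described; the field names are hypothetical, since each LAN protocol defines its own format:

  from dataclasses import dataclass

  @dataclass
  class Packet:
      source_address: str       # network address of the sending node
      destination_address: str  # network address of the receiving node
      sequence_number: int      # example of "other control information"
      payload: bytes            # the data to be sent

  pkt = Packet("192.168.1.10", "192.168.1.20", sequence_number=1, payload=b"hello")
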
Early LANs used coaxial cables as the communication medium because they could deliver reliable, high-speed communications. With improved communication signal–processing technologies, however, twisted-pair wires (Cat-5 and better quality) have become the standard. Twisted-pair wiring is inexpensive and has a high bandwidth (capacity for information transmission)
fiber-optic cable offers the highest bandwidth (over 1 billion bps, or 1 Gbps) and a high degree of reliability because it uses light waves to transmit information signals and is not susceptible to electrical interference.
Fiber-optic cable is used in LANs to increase transmission speeds and distances by at least one order of magnitude over twisted-pair wire. In addition, fiber-optic cable is lightweight and easy to install. Splicing and connecting into optical cable is more difficult than into twisted-pair wire, however, so in-house delivery of networking services to the desktop is still easier using twisted-pair wires. Fiber-optic cable and twisted-pair wires are often used in a complementary fashion—fiber-optic cable for the high-speed, shared backbone of an enterprise network or LAN and twisted-pair wires extending out from side-branch hubs to bring service to the workplace.

When coaxial cable installations are in place (e.g., in closed circuit television or in cable television services), LANs using coaxial cable can transmit signals using either broadband or baseband technology. 

Broadband is adapted from the technology for transmitting cable television. A broadband LAN can transmit multiple signals simultaneously, providing a unified environment for sending computer data, voice messages, and images. Cable modems provide the means for encoding and decoding the data, and each signal is sent within an assigned frequency range (channel). Baseband is simpler and is used in most LAN installations. It transmits digital signals over a single set of wires, one packet at a time, without special encoding as a television signal.

A router or a switch is a special device that is connected to more than one network and is equipped to forward packets that originate on one network segment to machines that have addresses on another network. Gateways perform routing and can also translate packet formats if the two connected networks run different communication protocols. 

Internet Communication

External routers can also link the users on a LAN to a regional network and then to the Internet. The Internet is a WAN that is composed of many regional and local networks interconnected by long-range backbone links, including international links. An Internet service provider (ISP) gets WAN access through a network access provider (NAP).

All Internet participants agree on many conventions called Internet standards. The most fundamental is the protocol suite referred to as the Transmission Control Protocol/Internet Protocol (TCP/IP). Data transmission is always by structured packets, and all machines are currently identified by a standard for 32-bit IP addresses. Internet addresses consist of a sequence of four 8-bit numbers, each ranging from 0 to 255, most often written as a dotted sequence of numbers: a.b.c.d.
Although IP addresses are not assigned geographically (the way ZIP codes are), the first number identifies a region, the second a local area, the third a local net, and the fourth a specific computer. Computers that are permanently linked into the Internet may have a fixed IP address assigned, whereas users whose machines reach the Internet by dialing into an ISP or making a wireless connection only when needed may get a temporary address that persists just during a session. The Internet is in the process of changing to a protocol (IPv6) that supports 128-bit IP addresses, because the worldwide expansion of the Internet and the proliferation of networked individual computer devices are exhausting the old 32-bit address space. While the changeover is complex, much work has gone into making this transition transparent to the user.
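
A small illustration with Python's standard ipaddress module; the addresses themselves are arbitrary examples:

  import ipaddress

  v4 = ipaddress.ip_address("171.64.1.10")   # four 8-bit numbers, written a.b.c.d
  print(v4.packed)                            # the same address as four raw bytes
  print(int(v4))                              # the same address as one 32-bit integer

  v6 = ipaddress.ip_address("2001:db8::1")    # an IPv6 example (documentation prefix)
  print(v6.max_prefixlen)                     # 128: IPv6 addresses are 128 bits long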

Because 32-bit (or 128-bit) numbers are difficult to remember, computers on the Internet also have names assigned. Multiple names may be used for a given computer that performs distinct services. The names can be translated to IP addresses—e.g., when they are used to designate a remote machine—by means of a hierarchical name management system called the Domain Name System (DNS). Designated computers, called name servers, convert a name into an IP address before the message is placed on the network; routing takes place based only on the numeric IP address.
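
A one-line illustration of that translation using the Python standard library (it needs a working network connection; the host name is taken from the text):

  import socket

  ip = socket.gethostbyname("www.nlm.nih.gov")  # DNS lookup: name -> numeric IP address
  print(ip)                                     # routing then uses only this numeric address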

The Internet is growing rapidly; therefore, periodic reorganizations of parts of the network are common. Numeric IP addresses may have to change, but the logical name for a resource can stay the same, and the (updated) DNS takes care of keeping the translation current. This overall process is governed today by the Internet Corporation for Assigned Names and Numbers (ICANN). Three conventions are in use for composing Internet names from segments:

  1. Functional convention: Under the most common convention for the United States, names are composed of hierarchical segments increasing in specificity from right to left, beginning with one of the top-level domain-class identifiers—e.g., computer.institution.class (smi.stanford.edu) or institution.class (whitehouse.gov). Initially the defined top-level domain classes were .com, .edu, .gov, .int, .mil, .org, and .net (for commercial, educational, government, international organizations, military, nonprofit, and ISP organizations, respectively). As Internet use has grown, more classes have been added—seven more became fully operational in 2002 (.aero, .biz, .coop, .info, .museum, .name, and .pro). This list will likely grow further in the future. Note that these functional top-level domain names (or classes) have three or more characters. Name hierarchies can be as deep as desired, but simplicity helps users to remember the names. Other conventions have evolved as well: www is often used as a prefix to name the World Wide Web (WWW) services on a computer (e.g., www.nlm.nih.gov).
  2. Geographic convention: Names are composed of hierarchical segments increasing in specificity from right to left and beginning with a two-character top-level country domain identifier—e.g., institution.town.state.country (cnri.reston.va.us or city.palo-alto.ca.us). Many countries outside of the United States use a combination of these conventions, such as csd.abdn.ac.uk, for the Computer Science Department at the University of Aberdeen (an academic institution in the United Kingdom). Note that the case of an Internet name is ignored, although additional fields, such as file names used to locate Web content resources, may be case-sensitive.
  3. Attribute list address (X.400) convention: Names are composed of a sequence of attribute-value pairs that specifies the components needed to resolve the address—e.g., /C=GB/ADMD=BT/PRMD=AC/O=Abdn/OU=csd/, which is equivalent to the address csd.abdn.ac.uk. This convention derives from the X.400 address standard that is used mainly in the European community. It has the advantage that the address elements (e.g., /C for Country name, /ADMD for Administrative Management Domain name, and /PRMD for Private Management Domain name) are explicitly labeled and may come in any order. Country designations differ as well. However, this type of address is generally more difficult for humans to understand and has not been adopted broadly in the Internet community.

The routing of packets of information between computers on the Internet is the basis for a rich array of information services. Each such service—be it resource naming, electronic mail, file transfer, remote computer log in, World Wide Web, or another service— is defined in terms of sets of protocols that govern how computers speak to each other. These worldwide intercomputer-linkage conventions allow global sharing of information resources, as well as personal and group communications. 

Software for Network Communication

Network power is realized by means of a large body of communications software. This software handles the physical connection of each computer to the network, the internal preparation of data to be sent or received over the network, and the interfaces between the network data flow and applications programs. 

Network service stacks and network protocols allow communication to take place between any two machines on the Internet, ensure that application programs are insulated from changes in the network infrastructure, and make it possible for users to take advantage easily of the rapidly growing set of information resources and services. 
The network stack serves to organize communications software within a machine. Because the responsibilities for network communications are divided into different levels, with clear interfaces between the levels, network software is made more modular.

The four-level network stack for TCP/IP

At the lowest level—the Data Link and Physical Transport level—programs manage the physical connection of the machine to the network, the physical-medium packet formats, and the means for detecting and correcting errors.
The Network level implements the IP method of addressing packets, routing packets, and controlling the timing and sequencing of transmissions.
The Transport level converts packet-level communications into several services for the Application level, including a reliable serial byte stream (TCP), a transaction-oriented User Datagram Protocol (UDP), and newer services such as real-time video.
The Application level is where programs run that support electronic mail, file sharing and transfer, Web posting, downloading, browsing, and many other services. 
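
A sketch of how an Application-level program asks the Transport level for either service; the host name, ports, and the local UDP receiver are assumptions made for illustration:

  import socket

  # TCP: a reliable, serial byte stream between two endpoints.
  tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
  tcp.connect(("example.org", 80))
  tcp.sendall(b"HEAD / HTTP/1.0\r\nHost: example.org\r\n\r\n")
  print(tcp.recv(200))
  tcp.close()

  # UDP: individual datagrams, sent without any delivery guarantee.
  udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
  udp.sendto(b"ping", ("127.0.0.1", 9999))   # hypothetical local receiver
  udp.close()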

If a computer changes its network connection from a Token Ring to an Ethernet network, or if the topology of the network changes, the applications are unaffected. Only the lower level Data Link and Network layers need to be updated.

Internet protocols are shared conventions that serve to standardize communications between machines—much as, for two people to communicate effectively, they must agree on the syntax and meaning of the words they are using, the style of the interaction (lecture versus conversation), a procedure for handling interruptions, and so on.

Protocols are defined for every Internet service (such as routing, electronic mail, and Web access) and establish the conventions for representing data, for requesting an action, and for replying to a requested action. For example, protocols define the format conventions for e-mail addresses and text messages (RFC822), the attachment of multimedia content (Multipurpose Internet Mail Extensions (MIME)), the delivery of e-mail messages (Simple Mail Transport Protocol (SMTP)), the transfer of files (File Transfer Protocol (FTP)), connections to remote computers (Telnet), the formatting of Web pages (Hypertext Markup Language (HTML)), the exchange of routing information, and many more. By observing these protocols, machines of different types can communicate openly and can interoperate with each other. When requesting a Web page from a server using the Hypertext Transfer Protocol (HTTP), the client does not have to know whether the server is a UNIX machine, a Windows machine, or a mainframe running VMS—they all appear the same over the network if they adhere to the HTTP protocol.

The layering of the network stack is also supported by protocols. As we said, within a machine, each layer communicates with only the layer directly above or below. Between machines, each layer communicates with only its peer layer on the other machine, using a defined protocol. For example, the SMTP application on one machine communicates with only an SMTP application on a remote machine. Similarly, the Network layer communicates with only peer Network layers, for example, to exchange routing information or control information using the Internet Control Message Protocol (ICMP).
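
A minimal sketch of the HTTP interoperability mentioned above: the client below neither knows nor cares what operating system the server runs, only that it speaks HTTP (the URL is just an example):

  from urllib.request import urlopen

  with urlopen("http://www.example.org/") as response:  # any server that speaks HTTP
      page = response.read()
  print(len(page), "bytes of HTML received")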


Basic services available on the Internet:

  1. Electronic mail: Users send and receive messages from other users via electronic mail, mimicking use of the postal service. The messages travel rapidly: except for queuing delays at gateways and receiving computers, their transmission is nearly instantaneous. Electronic mail was one of the first protocols invented for the Internet (around 1970, when what was to become the Internet was still called the ARPANET). A simple e-mail message consists of a header and a body. The header contains information formatted according to the RFC822 protocol, which controls the appearance of the date and time of the message, the address of the sender, addresses of the recipients, the subject line, and other optional header lines. The body of the message contains free text. The user addresses the e-mail directly to the intended reader by giving the reader’s account name or a personal alias followed by the IP address of the machine on which the reader receives mail—e.g., JohnSmith@IP.address. If the body of the e-mail message is encoded according to the MIME standard it may also contain arbitrary multimedia information, such as drawings, pictures, sound, or video. Mail is sent to the recipient using the SMTP standard. It may either be read on the machine holding the addressee’s account or it may be downloaded to the addressee’s PC for reading using either the Post Office Protocol (POP) or the Internet Mail Access Protocol (IMAP). (A minimal sketch of constructing and sending such a message appears after this list.) Some mail protocols allow the sender to specify an acknowledgment to be returned when the mail has been deposited or has been read. Electronic mail has become an important communication path in health care, allowing asynchronous, one-way communications between participants. Requests for services, papers, meetings, and collaborative exchanges are now largely handled by electronic mail (Lederberg, 1978). It is easy to broadcast electronic mail by sending it to a mailing list or a specific list-server, but electronic mail etiquette conventions dictate that such communications be focused and relevant. Spamming, which is sending e-mail solicitations or announcements to broad lists, is annoying to recipients, but is difficult to prevent. Conventional e-mail is sent in clear text over the network so that anyone observing network traffic can read its contents. Protocols for encrypted e-mail, such as Privacy-Enhanced Mail (PEM), or encrypting attachments, are also available, but are not yet widely deployed. They ensure that the contents are readable by only the intended recipients.
  2. File Transfer Protocol (FTP): FTP facilitates sending and retrieving large amounts of information—of a size that is uncomfortably large for electronic mail. For instance, programs and updates to programs, complete medical records, papers with many figures or images for review, and the like are best transferred via FTP. FTP access requires several steps: (1) accessing the remote computer using the IP address; (2) providing user identification to authorize access; (3) specifying the name of a file to be sent or fetched using the file-naming convention at the destination site; and (4) transferring the data. For open sharing of information by means of FTP sites, the user identification is by convention “anonymous” and the requestor’s e-mail address is used as the password.
  3. Telnet: Telnet allows a user to log in on a remote computer. If the log-in is successful, the user becomes a fully qualified user of the remote system, and the user’s own machine becomes a relatively passive terminal. The smoothness of such a terminal emulation varies depending on the differences between the local and remote computers. Many Telnet programs emulate well-known terminal types, such as the VT100, which are widely supported and minimize awkward mismatches of character-use conventions. Modest amounts of information can be brought into the user’s machine by copying data displayed in the terminal window into a local text editor or other program (i.e., by copying and pasting).
  4. World Wide Web (WWW): Web browsing facilitates user access to remote information resources made available by Web servers. The user interface is typically a Web browser that understands the basic World Wide Web protocols. The Universal Resource Locator (URL) is used to specify where a resource is located in terms of the protocol to be used, the domain name of the machine it is on, and the name of the information resource within the remote machine. The HTML describes what the information should look like when displayed. These formats are oriented toward graphic displays, and greatly exceed the capabilities associated with Telnet character-oriented displays. The HTML supports conventional text, font settings, headings, lists, tables, and other display specifications. Within HTML documents, highlighted buttons can be defined that point to other HTML documents or services. This hypertext facility makes it possible to create a web of cross-referenced works that can be navigated by the user. The HTML can also refer to subsidiary documents that contain other types of information—e.g., graphics, equations, images, video, speech—that can be seen or heard if the browser has been augmented with helpers or plug-ins for the particular format used. Browsers, such as Netscape Navigator or Internet Explorer, also provide choices for downloading the presented information so that no separate FTP tasks need to be initiated. The HTTP is used to communicate between browser clients and servers and to retrieve HTML documents. Such communications can be encrypted to protect sensitive contents of interactions (e.g., credit card information or patient information) from external view using protocols such as Secure Sockets Layer (SSL).
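
The sketch referenced in item 1 above, using Python's standard email and smtplib modules; the addresses, attachment contents, and mail server name are hypothetical placeholders:

  from email.message import EmailMessage
  import smtplib

  msg = EmailMessage()
  msg["From"] = "JohnSmith@example.org"    # sender address (header line)
  msg["To"] = "JaneDoe@example.org"        # recipient address (header line)
  msg["Subject"] = "Lab results"           # subject line (header line)
  msg.set_content("The body of the message contains free text.")
  msg.add_attachment(b"...binary image data...",
                     maintype="image", subtype="png",
                     filename="scan.png")  # MIME-encoded multimedia content

  # Delivery over SMTP (commented out: it needs a reachable mail server).
  # with smtplib.SMTP("mail.example.org") as server:
  #     server.send_message(msg)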

A client–server interaction is a generalization of the four interactions we have just discussed, involving interactions between a client (requesting) machine and a server (responding) machine. A client–server interaction, in general, supports collaboration between the user of a local machine and a remote computer. The server provides information and computational services according to some protocol, and the user’s computer—the client—generates requests and does complementary processing (such as displaying HTML documents and images). A common function provided by servers is database access. Retrieved information is transferred to the client in response to requests, and then the client may perform specialized analyses on the data. The final results can be stored locally, printed, or mailed to other users.



More precisely: what to read about security in Information Systems

Situation as of January 1, 2013.

  • van Bemmel, Chapter 33
  • Shortliffe, Chapter 5:
    • pages: 226 - 230 Start 226 with "Data and System Security"
      End 230 at "Summary"


van Bemmel, Chapter 33

purpose of data protection measures is the protection of:
privacy,
quality of medical data and software,
availability of data and function

confidentiality - access rights of users must be defined to protect confidentiality; threats: data falling into the wrong hands
record linkage - collecting data relating to the same person in various files

integrity - threats: data are inconsistent or their contents are corrupted

availability - the system must be available at least 99.7% of the time (24 hours a day); threats: equipment failure, insufficient environmental facilities to enable flawless functioning of the software

Measures may improve data protection in two ways:
they may reduce the probability that something goes wrong (fire prevention, passwords, limited access to the location where the computer is housed),
or they may reduce the level of damage in case something does go wrong (backup files, setting up and rehearsing a crisis procedure)

Equipment measures:
computer center that can be well locked,
fire protection,
air-conditioning,
protection against flooding

Software measures:
verifiability of software,
well-tested database management systems,
identification of users,
prohibition of easy passwords,
log book

Organisational measures:
separation of duties,
handbook of operations,
management of authorisations


Shortliffe, Chapter 5, Data and System Security

three separate concepts involved in protecting health care information:
  1. Privacy refers to the desire of a person to control disclosure of personal health and other information.
  2. Confidentiality applies to information—in this context, the ability of a person to control the release of his or her personal health information to a care provider or information custodian under an agreement that limits the further release or use of that information.
  3. Security is the protection of privacy and confidentiality through a collection of policies, procedures, and safeguards. Security measures enable an organization to maintain the integrity and availability of information systems and to control access to these systems’ contents. 

In general, the security steps taken in a health care information system serve five key functions:
  1. Availability ensures that accurate and up-to-date information is available when needed at appropriate places.
  2. Accountability helps to ensure that users are responsible for their access to, and use of, information based on a documented need and right to know. 
  3. Perimeter definition allows the system to control the boundaries of trusted access to an information system, both physically and logically. 
  4. Role-limited access enables access for personnel to only that information essential to the performance of their jobs and limits the real or perceived temptation to access information beyond a legitimate need. 
  5. Comprehensibility and control ensures that record owners, data stewards, and patients can understand and have effective control over appropriate aspects of information confidentiality and access.


Technical means to ensure accountability include two additional functions: authentication and authorization.
The user is authenticated through a positive and unique identification process, such as a name and password combination.
The authenticated user is authorized within the system to perform only certain actions appropriate to his or her role in the health care system—e.g., to search through certain medical records of only patients under his or her care.
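
A hypothetical sketch of these two accountability functions; the user store, the password hashing, and the care relation are invented for illustration (a real system would use salted password hashing and a proper database):

  import hashlib
  import hmac

  USERS = {"jsmith": hashlib.sha256(b"s3cret").hexdigest()}  # name -> password hash
  CARE_RELATION = {"jsmith": {"patient-042"}}                # provider -> own patients

  def authenticate(name, password):
      # Positive and unique identification, here by a name/password combination.
      stored = USERS.get(name)
      supplied = hashlib.sha256(password.encode()).hexdigest()
      return stored is not None and hmac.compare_digest(stored, supplied)

  def authorize_record_access(name, patient_id):
      # Role-limited access: a provider may search only the records of
      # patients under his or her care.
      return patient_id in CARE_RELATION.get(name, set())

  if authenticate("jsmith", "s3cret") and authorize_record_access("jsmith", "patient-042"):
      print("access granted")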

Virtual Private Network (VPN) technologies offer a powerful way to let bona fide users access information resources remotely. Using a client–server approach, an encrypted communication link is negotiated between the user’s client machine and an enterprise server. This approach protects all communications and uses strong authentication to identify the user.

Cryptographic encoding is a primary tool for protecting data that are stored and are transmitted over communication lines. Two kinds of cryptography are in common use—secret-key cryptography and public-key cryptography. In secret-key cryptography, the same key is used to encrypt and to decrypt information. Thus, the key must be kept secret, known to only the sender and intended receiver of information. In public-key cryptography, two keys are used, one to encrypt the information and a second to decrypt it. Because two keys are involved, only one need be kept secret. The other one can be made publicly available. This arrangement leads to important services in addition to the exchange of sensitive information, such as digital signatures (to certify authorship), content validation (to prove that the contents of a message have not been changed), and nonrepudiation (to ensure that an order or payment for goods received cannot be disclaimed). Under either scheme, once data are encrypted, a key is needed to decode and make the information legible and suitable for processing.
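
A sketch of both schemes using the third-party Python package cryptography (assumed to be installed); key management is drastically simplified here for illustration:

  from cryptography.fernet import Fernet
  from cryptography.hazmat.primitives import hashes
  from cryptography.hazmat.primitives.asymmetric import rsa, padding

  # Secret-key cryptography: the same key encrypts and decrypts, so sender
  # and receiver must both hold it and keep it secret.
  secret_key = Fernet.generate_key()
  f = Fernet(secret_key)
  ciphertext = f.encrypt(b"patient record")
  assert f.decrypt(ciphertext) == b"patient record"

  # Public-key cryptography: anyone may encrypt with the public key, but only
  # the holder of the private key can decrypt.
  private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
  public_key = private_key.public_key()
  oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                      algorithm=hashes.SHA256(), label=None)
  ct = public_key.encrypt(b"patient record", oaep)
  assert private_key.decrypt(ct, oaep) == b"patient record"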

Single-layer encryption with keys of 56-bit length (the length prescribed by the 1975 Data Encryption Standard (DES)) is no longer considered secure, and keys of 128 bits are routine.

audit trails - a security-relevant chronological record, set of records, or destination and source of records that provides documentary evidence of the sequence of activities that have affected, at any time, a specific operation, procedure, or event.
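
A minimal, hypothetical sketch of such a record as an append-only log; the field names and file path are invented:

  import json
  import time

  def audit(log_path, user, action, record_id):
      # One chronological entry: who did what to which record, and when.
      entry = {"timestamp": time.strftime("%Y-%m-%dT%H:%M:%S"),
               "user": user, "action": action, "record": record_id}
      with open(log_path, "a") as log:   # append-only by convention
          log.write(json.dumps(entry) + "\n")

  audit("audit.log", "jsmith", "read", "patient-042")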