Get a sense of databases

We are living in the golden era of data science, as an ever larger volume of data is created every day. IBM, one of the leaders in the Information Technology (IT) industry, estimated that 2.5 billion gigabytes of data were being generated every day in 2012. Because of this surge in data generation, databases have become even more important. Becoming a database expert takes a lot of study, however. This article gives a general idea of databases, with a focus on relational ones.

What is a database?
A database is an organized and complete collection of data. Simply put, it is the place where your data are stored and from which they can easily be retrieved.

Why and how is a database organized?
The data must be structured in an appropriate way so that you can easily extract or update them. A relational database structures data as a collection of tables, where each table contains similar data. Student data, for example, will be stored in a different table from lesson data.
Apart from grouping similar data in tables, the data should be complete in order to build a consistent database. For instance, if you wanted to create the database of YouTube and you had created only two tables, one for videos and one for users, you could not answer a question such as “which videos have obscene comments that must be deleted?”, because the table that stores the comments is missing. That is called deficiency of data. I’ll try to explain it with an example from an everyday workplace.
Imagine you are a receptionist at the Hilton Hotel and today you are asked to send back a suitcase that a loyal customer has forgotten. Let’s say that you never wrote his address or his telephone number in the table “Client” (where you store the information about clients). Such a mess! Now you have to face your boss (possibly the hotel manager) and you may well be fired, all because you did not store all the data you need.
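As a minimal sketch of "one table per kind of data", the snippet below uses Python's built-in sqlite3 module with a throwaway in-memory database. The table and column names (students, lessons, name, title) and the sample rows are assumptions made only for illustration.

```python
import sqlite3

# In-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")

# One table per kind of data: students in one, lessons in another.
conn.execute("CREATE TABLE students (name TEXT)")
conn.execute("CREATE TABLE lessons (title TEXT)")

# Hypothetical sample rows.
conn.execute("INSERT INTO students (name) VALUES ('Alice')")
conn.execute("INSERT INTO lessons (title) VALUES ('Introduction to databases')")

# Ask SQLite which tables now make up the database.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)  # ['lessons', 'students']
```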
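One possible way to protect the receptionist from this situation is to let the database itself refuse incomplete records. The sketch below, again using Python's sqlite3 module, marks the columns of a Client table as NOT NULL. The column names and the sample customer are assumptions, and this is only one of several ways to enforce completeness.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Every column is required: a client without address or phone is rejected.
conn.execute("""
    CREATE TABLE Client (
        name    TEXT NOT NULL,
        address TEXT NOT NULL,
        phone   TEXT NOT NULL
    )
""")

try:
    # Hypothetical client stored with a name only.
    conn.execute("INSERT INTO Client (name) VALUES ('Loyal Customer')")
    saved = True
except sqlite3.IntegrityError as err:
    saved = False
    print("Rejected:", err)

stored_clients = conn.execute("SELECT COUNT(*) FROM Client").fetchone()[0]
```

Here the incomplete record never reaches the table, so the problem shows up at check-in time rather than when the suitcase has to be shipped.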

How to query the database?
Suppose you have a relational database that stores the weather for each week and you want some answers about the number of sunny days in January. How will you communicate with the database? You need to speak the same language, and until we teach databases and computers to speak English (ok, a lame joke) we have to learn their language, which is SQL (Structured Query Language). Don’t be terrified! SQL has only a small set of core commands that let you ‘play’ with your data, and it is really easy to learn because its statements read much like plain English.
For example, you are a secretary at a university and you want to select the names of the students. You just type “SELECT name FROM students”; to keep only the students who registered this year you would add a WHERE clause as a filter. If you really enter the world of databases, you will type the words SELECT and FROM so often that you will get bored of them. Suppose you want to delete the table lessons from your database (not a realistic example, since lessons will never be removed from a university, but anyway): you simply type “DROP TABLE lessons”. Now, say you are a serious businessman and want to find the names and the cities of your customers. How will you accomplish that?
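To make the weather question concrete, here is a rough sketch in Python with sqlite3. It assumes a hypothetical table weather with one row per day, a text column day holding ISO dates and a column sky holding the condition; the article does not specify the real layout, so treat all of these names and rows as illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical layout: one row per day, ISO date plus sky condition.
conn.execute("CREATE TABLE weather (day TEXT, sky TEXT)")
conn.executemany(
    "INSERT INTO weather (day, sky) VALUES (?, ?)",
    [
        ("2024-01-01", "sunny"),
        ("2024-01-02", "rainy"),
        ("2024-01-03", "sunny"),
        ("2024-02-01", "sunny"),  # February, must not be counted
    ],
)

# Count the sunny days whose date falls in January 2024.
sunny_days = conn.execute(
    "SELECT COUNT(*) FROM weather "
    "WHERE sky = 'sunny' AND day LIKE '2024-01-%'"
).fetchone()[0]
print(sunny_days)  # 2
```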
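One possible answer to the businessman's question, sketched with Python's sqlite3 module: list the wanted columns after SELECT, separated by commas. The customers table, its columns and its rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical customers table with a few sample rows.
conn.execute("CREATE TABLE customers (name TEXT, city TEXT, phone TEXT)")
conn.executemany(
    "INSERT INTO customers (name, city, phone) VALUES (?, ?, ?)",
    [("Maria", "Athens", "555-0101"), ("John", "London", "555-0102")],
)

# Names and cities only; the phone column is simply not requested.
rows = conn.execute(
    "SELECT name, city FROM customers ORDER BY name").fetchall()
print(rows)  # [('John', 'London'), ('Maria', 'Athens')]

# DROP TABLE removes the whole table, structure and data alike.
conn.execute("DROP TABLE customers")
remaining = conn.execute(
    "SELECT COUNT(*) FROM sqlite_master WHERE name = 'customers'"
).fetchone()[0]
```

The ORDER BY clause is added only so that the output order is predictable; without it the database may return the rows in any order.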

What’s happening in the market?
Most people are familiar with relational databases, as they are the most widespread. There are three common types of databases in use today: relational, object-oriented and NoSQL. Ok, there are also cloud, in-memory and graph databases, but a closer look at them falls outside the scope of this introductory tutorial.
Relational databases are the most prevalent approach and were first proposed in 1970. As we said earlier, they are composed of tables, where each table stores similar data about an entity. A table contains fields or attributes (the columns) and records (the rows). The tables give the database its structure, and the fields hold the data that are stored. Each table has a Primary Key (PK) that uniquely identifies each row. In simple words, the primary key is a column whose value is different in every row. For example, if two students with exactly the same name appear in the students table of the university database, we could not tell which of the two has passed the lesson ‘Introduction to databases’ unless we have a PK.
Secondly, we have object-oriented databases, which grew out of object-oriented programming (OOP; if you have ever heard of Java, Python or C++, these languages support it). The difference is that the data here are defined as classes and objects. Roughly, a class corresponds to a table and an object (instance) corresponds to a record. Besides SQL, such databases also work with languages like Java, so they are friendlier to programmers. They are, however, the least prevalent in the market.
A third category is still emerging today: NoSQL, which stands for “non SQL” (often also read as “not only SQL”), a name that does not necessarily mean SQL is never used. NoSQL databases are often very fast and are used for managing large-scale data, commonly known as Big Data. The data here are stored as key/value pairs, documents or graphs. The most challenging part is that they are usually accessed through APIs instead of direct SQL queries. Application Programming Interfaces are interfaces that let one software program interact easily with another. NoSQL databases and APIs are considered modern technologies and have already gained huge popularity. There is a lot more to say on this topic; what you read here is just the very basic theory.
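The two-students example can be sketched as follows with Python's sqlite3 module. The column names student_id and passed_intro and the name "Nick Smith" are hypothetical; the point is only that the primary key tells the two rows apart and that the database refuses a repeated key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# student_id is the primary key: its value must differ in every row.
conn.execute("""
    CREATE TABLE students (
        student_id   INTEGER PRIMARY KEY,
        name         TEXT,
        passed_intro INTEGER DEFAULT 0
    )
""")

# Two different students who happen to share exactly the same name.
conn.executemany(
    "INSERT INTO students (student_id, name) VALUES (?, ?)",
    [(1, "Nick Smith"), (2, "Nick Smith")],
)

# Only student 2 passed; the PK lets us address that single row.
conn.execute("UPDATE students SET passed_intro = 1 WHERE student_id = 2")
passed_ids = [row[0] for row in conn.execute(
    "SELECT student_id FROM students WHERE passed_intro = 1")]
print(passed_ids)  # [2]

# Reusing an existing key is rejected by the database.
try:
    conn.execute(
        "INSERT INTO students (student_id, name) VALUES (2, 'Someone Else')")
    duplicate_allowed = True
except sqlite3.IntegrityError:
    duplicate_allowed = False
```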
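To give a feeling for the three shapes of data mentioned above, here is a small sketch using nothing but plain Python structures. It does not use any real NoSQL product or its API; all keys, names and values are invented for illustration.

```python
import json

# Key/value: a value is looked up by its key and nothing else.
key_value_store = {"user:42:name": "Maria"}

# Document: one self-contained, nested record (often stored as JSON).
document = {
    "id": 42,
    "name": "Maria",
    "videos": [{"title": "My first video", "comments": ["Nice!"]}],
}

# Graph: nodes plus the connections (edges) between them.
follows = {"Maria": ["John"], "John": []}

name = key_value_store["user:42:name"]
first_comment = document["videos"][0]["comments"][0]
maria_follows = follows["Maria"]

# A document round-trips through JSON text without losing its structure.
restored = json.loads(json.dumps(document))
```

Notice that the document keeps the videos and their comments inside the user record, whereas a relational design would typically spread them over separate tables.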

As we said, the database is where all the information is stored.
But how do users or other applications interact with the database?

The database management system (DBMS) is the software that enables processes such as the creation and maintenance of databases. Moreover, it enforces rules on the database, such as the access policy. For example, you want to show the data of your website to a guest but you don’t want to give him the right to delete any of the data; this is called role-based access to the data. Depending on the type of data you want to store, you select a suitable DBMS. The best-known relational DBMSs are MySQL, Oracle Database, Microsoft SQL Server, IBM DB2 and PostgreSQL. Some object-oriented DBMSs are O2, Objectivity/DB and GemStone. The most representative NoSQL systems are MongoDB (document), DynamoDB (key-value) and GraphBase (graph).
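The idea behind role-based access can be illustrated with a toy sketch in Python. This is not how any particular DBMS implements it (systems such as MySQL or PostgreSQL have their own mechanisms, for instance GRANT statements); the role names, the permission table and the is_allowed helper are all hypothetical.

```python
# Hypothetical policy: which SQL commands each role may run.
ROLE_PERMISSIONS = {
    "guest": {"SELECT"},
    "admin": {"SELECT", "INSERT", "UPDATE", "DELETE"},
}


def is_allowed(role, statement):
    """Return True if the role may run the given SQL statement."""
    words = statement.split()
    command = words[0].upper() if words else ""
    # Unknown roles get an empty permission set, so everything is refused.
    return command in ROLE_PERMISSIONS.get(role, set())


print(is_allowed("guest", "SELECT title FROM articles"))  # True
print(is_allowed("guest", "DELETE FROM articles"))        # False
print(is_allowed("admin", "DELETE FROM articles"))        # True
```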
Ok, I can manage the database by buying a DBMS and learning how to use it. But how does a database connect to the web?
The architecture is like a chain. The client is the user’s interface; for web-based applications it is the browser. But the client could also be the database administrator’s tool or an application’s graphical interface that needs to communicate with the database. It is what a user sees and can read. When a user types, for example, www.bbc.com, the browser sends an HTTP request over the internet to the web server, which passes it to the application, which in turn passes it to the DBMS. The DBMS then reads the data from the database, and the result travels back along the same chain in the reverse order. The application executes the business logic, whereas the DBMS is the software that manages and queries the database. Finally, the database is the place where all the data are kept. It is worth noting that the application is what an enterprise’s programmers build, whereas the DBMS is what the enterprise buys, and, for a relational DBMS, the two communicate with each other in SQL.
In software engineering this is generally known as three-tier architecture. It refers to a client–server software architecture pattern in which the user interface (presentation), the functional process logic (“business rules”) and the computer data storage and data access are developed and maintained as independent modules, most often on separate platforms. So whether you are coding in C or Java, whether we are talking about a mobile or a desktop interface, and whether you are using MySQL or MongoDB as the DBMS, this remains the same.
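The separation into tiers can be sketched in a few lines of Python, with sqlite3 standing in for the DBMS. Everything lives in one file here purely for brevity; in a real system each tier would usually run on its own platform. The function names, the articles table and its rows are assumptions for the example.

```python
import sqlite3


# Data tier: the DBMS (here SQLite) and the database it manages.
def open_database():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE articles (title TEXT)")
    conn.executemany(
        "INSERT INTO articles (title) VALUES (?)",
        [("Weather today",), ("Sports results",)],
    )
    return conn


# Logic tier: business rules; it talks to the data tier in SQL.
def latest_titles(conn, limit):
    rows = conn.execute(
        "SELECT title FROM articles ORDER BY rowid DESC LIMIT ?", (limit,))
    return [row[0] for row in rows]


# Presentation tier: turns the result into what the user sees.
def render_page(titles):
    return "\n".join(f"* {title}" for title in titles)


conn = open_database()
page = render_page(latest_titles(conn, 1))
print(page)  # * Sports results
```

Because each tier only talks to its neighbour, the presentation function could be swapped for a mobile screen, or SQLite for another DBMS, without touching the other tiers.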

This may have been a long introduction to databases, but it is really just the very beginning of getting familiar with how a database works. If you like the idea of working with databases, or your teacher wants you to design one, you should read this article as a second step in this journey.

