Tuesday, May 31, 2016

Entity Relationship Modeling and Diagraming

This week, we're learning how to design entity relationship diagrams to graphically depict various components of entities, attributes and their relationships.  Hopefully, it will enable me to better implement database designs.  There's a lot to take in, so reviewing components previously studied in past modules is necessary.

Tuesday, May 24, 2016

Normalization

Normalization is an analytical process that yields good database design and table structure with minimal data redundancy, and therefore, less anomaly, predicated on the concepts of attribute determination and dependency.  The process works through stages of lower to higher normal forms, as well as Boyce-Codd Normal Form (BCNF).  (Sometimes, denormalization or less normalization is best, for some examples, when one needs to better reflect an organization's real operations or when end-user demand requires faster querying because more resources may be required or more relational join operations must be performed with higher forms.)  However, the first through third normal forms should almost always be implemented, along with the BCNF, when analyzing database design.  A table in the 1NF has labeled dependencies, identifiable primary keys and attributes and lacks repeating groups, where one piece of data represents a set of data, leaving nulls instead of an appropriate data value.  To be in the 2NF, a table must, in addition to the above 1NF requirements, be without partial dependencies, where one attribute is dependent on part of a composite primary key determinant.  To be 3NF compliant, in addition to the above, a table must lack transitive dependences, where a nonprime attribute is determinant of a nonprime dependent.  In both cases, the determinant is copied and made a primary key in its own table with the dependent, which is extracted from the table of origin.  Lastly, for discussion here, is the BCNF, which may only be violated when the table contains more than one candidate key, when a non-key attribute determines a prime key attribute.  It can be resolved by reassigning the primary keys.  A table is in BCNF when all determinants are candidate keys.

Tuesday, May 17, 2016

Advancing with SQL

Through this week's assignments, I've developed a greater understanding of the relationships and interactions between tables.  Creating foreign keys requires establishing those tables first, as well as setting up associated primary keys, and joining tables requires consideration be given to the variety of data queries one may make.  Null values in the place of common, connecting attributes, for example, requires outer joins from left or right dependent on where the otherwise excluded pieces occur.  Other times, a simple natural join is the quickest command, if one is further specifying certain conditions be met, which would make any excluded bits of data irrelevant.

We've used spool commands to capture our output, which is very helpful for debugging, because it's far easier to re-run a file than to re-type the code.  We've also separated our creation and modification files to help to this end, which is a best practice I'll apply in the future.

I had some debugging issues with parent integrity, due to a mismatched character data type between a primary and foreign key.

Otherwise, I'm reminded that double- and triple-checking data is still sometimes not enough to catch every typo, certainly as the database grows, which is frustrating.  Sometimes, reading code backwards (right-to-left) to view data from a new perspective or taking a break and returning to review insertion commands later, can help.

It's nice to see Database Administrator jobs posted -- 189 today in SF alone on indeed.com -- and to be learning some of the skills required in the specs.  I also understand that software developers, engineers and analysts alike all use SQL in their work, more or less.

Tuesday, May 10, 2016

SQL

This week, it's been exciting to have been studying SQL (Structured Query Language) commands for creating databases and relational table structures, performing various types of data manipulations and administration and querying the database to retrieve useful information!  These include commands and options such as INSERT, SELECT (FROM, WHERE, GROUP BY, HAVING, ORDER BY), UPDATE, DELETE, COMMIT and ROLLBACK, along with comparison or logical operators and aggregate functions.  Interestingly, SQL is a relatively simple language to learn, comprising a vocabulary of fewer that 100 words.  We've also studied more on relational databases and some basic relational algebra and its operators: UNION, INTERSECT, DIFFERENCE, PRODUCT, SELECT, PROJECT, and JOIN.  I've realized, contrary to the musings of last week's blog, that the SQL 'dialects' (such as Oracle, MySQL) are rather similar, so that in studying and learning one, I can more easily work with any of them.   Furthermore, according to Coronel et al., Oracle was number one in the RDBMS sector in 2010....  Anyway, the industry is always changing, and it'll be interesting to see what happens to DBMS as unstructured data, Big Data, and with it NoSQL, become more prevalent.  Still, it seems there will long, if not always, be a need for structured database models for traditional information, of which the relational one is tops.

Tuesday, May 3, 2016

Relational Database Management Systems

According to Coronel, Morris and Rob, databases are shared and integrated computer structures that store a collection of end-user data and metadata in a methodical way that is much more dynamic, malleable and synchronized than the traditional file system.  Databases avoid some flaws of the file system, for example, it's possible to add single bits of unconnected data without the requirement of adding filler data to a ledger's expansive columns and, secondly, deletion/edit anomalies no longer occur.  It's easy to understand how databases could be valuable when a company has trillions of pieces of raw data to manage, process and store.  Also, databases allow for quick transformations of raw data into graphical or tabular presentations of more meaningful information, which improve business decision-making.

I have no prior education nor experience in database modeling, i.e., representing and storing data, nor management software, which is used to create/generate, manage, store and retrieve/query the interrelated data by the single or multi-user end user(s), so I'm looking forward to learning it, other database concepts and statement writing, and, in particular, MySQL.  According to Coronel et al., databases result in improved data sharing, security, integration, productivity and access, as well as minimized inconsistency.  Single users have desktop databases and multi-users either workspace or enterprise databases.  Also, if the data is located at a single site, it's centralized, in contrast to distributed.

General purpose databases are varied and discipline specific ones will be more specific to its related discipline, such as medical of financial records.  Alternatively, operational databases may focus on something like a transaction occurring day-to-day.  Analytical databases focus on storing metrics used for tactical decision making, having "massaged" or manipulated raw data to extract valuable information.  This data is made structured by the processing of raw data to business intelligence.  Most databases mentioned above will use at least semi-structured data, perhaps that made textual by XML.

The specific database model we will study is the predominant relational one, which has the logical structuring of related tables, connected by super, primary, candidate, foreign and other keys, central to its system.

After listening to class orientation, my understanding is that we'll largely be studying relational database models and primarily using Oracle, in particular, with a focus on foundational theory and design, so I'll be interested to compare the two software and study implementing the former, i.e., MySQL, on my own, if necessary, given its ubiquity in the workplace.  (I'm hopeful the divide between theory and practice won't be too great, in this respect, as is sometimes the case in academia, and that I'll develop not only a foundation here but a working, transferable knowledge that will be applicable in a professional capacity such as database developer, designer, administrator, architect, consultant, etc..)  My aim is to gain a working proficiency that includes implementation of these, and I'm looking forward to this class!   Still, I appreciate the value of learning strong design fundamentals, which is crucial.

I had a slight delay getting started this week, as I went to the the south bay to collect a PC from my sister, who's agreed to let me borrow it for the length of the class.  There were no issues installing Oracle, which I left overnight and returned to in the morning.

Reference

Coronel, Carlos, Steven Morris, Peter Rob, and Carlos Coronel. Database Principles: Fundamentals of Design, Implementation, and Management. Vol. 10. Mason, OH: South-Western, 2013. Print.