DNP 805 Topic 4 Discussion Question Two
In the prior discussion question in this topic, you selected a defined patient population and listed elements that you think will be valuable in a database. Of those elements you identified to be valuable in a database, which are structured and unstructured? Explain.
What is Structured Data?
The term structured data refers to data that resides in a fixed field within a file or record. Structured data is typically stored in a relational database (RDBMS). It can consist of numbers and text, and sourcing can happen automatically or manually, as long as it’s within an RDBMS structure. It depends on the creation of a data model, defining what types of data to include and how to store and process it.
The standard language for querying and managing structured data is SQL (Structured Query Language). Developed at IBM in the 1970s, SQL is used to define, query, and update relational databases. Typical examples of structured data are names, addresses, credit card numbers, geolocation, and so on.
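As a minimal sketch of what a fixed-field record and a SQL query look like in practice, the example below uses Python's built-in sqlite3 module with an in-memory database. The table name, column names, and patient values are hypothetical and chosen only for illustration; they are not taken from any real system.

```python
import sqlite3

# Hypothetical schema: the table, columns, and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE patient (
        patient_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        city       TEXT NOT NULL,
        hba1c      REAL
    )
    """
)

# Every row must fit the predefined fields of the data model.
rows = [
    (1, "A. Rivera", "Phoenix", 7.9),
    (2, "B. Chen", "Tucson", 6.4),
    (3, "C. Okafor", "Phoenix", 8.6),
]
conn.executemany("INSERT INTO patient VALUES (?, ?, ?, ?)", rows)


def patients_above(threshold):
    """Return (name, hba1c) pairs above the threshold, highest first."""
    cur = conn.execute(
        "SELECT name, hba1c FROM patient WHERE hba1c > ? ORDER BY hba1c DESC",
        (threshold,),
    )
    return cur.fetchall()


print(patients_above(7.0))
```

Because each value sits in a named, typed column, the question "which patients have a value above 7.0?" is a one-line query.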
What is Unstructured Data?
Unstructured data is, broadly, any data that does not conform to a predefined data model. Even though unstructured data may have a native, internal structure, it’s not structured in a predefined way. There is no data model; the data is stored in its native format.
Typical examples of unstructured data are rich media, text, social media activity, surveillance imagery, and so on.
The amount of unstructured data is much larger than that of structured data. Unstructured data is commonly estimated to make up 80% or more of all enterprise data, and the percentage keeps growing. This means that companies that do not take unstructured data into account are missing out on a lot of valuable business intelligence.
Structured vs Unstructured Data: 5 Key Differences
1) Defined vs Undefined Data
Structured data consists of clearly defined data types arranged in a fixed structure, while unstructured data is usually stored in its native format. Structured data lives in rows and columns and can be mapped into predefined fields. Unlike structured data, which is organized and easy to access in relational databases, unstructured data does not have a predefined data model.
2) Quantitative vs Qualitative Data
Structured data is often quantitative data, meaning it usually consists of hard numbers or things that can be counted. Methods for analysis include regression (to model and predict relationships between variables); classification (to assign records to categories, often with an estimated probability); and clustering (to group records based on shared attributes).
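To illustrate the regression method mentioned above, here is a small sketch of an ordinary least-squares line fit written in plain Python. The ages and blood pressure readings are made-up values, and fit_line is a hypothetical helper name used only for this example.

```python
# Hypothetical readings: age in years and systolic blood pressure in mmHg.
ages = [40, 50, 60, 70]
systolic = [120, 126, 132, 138]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-style and variance-style sums about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(ages, systolic)

# Use the fitted line to estimate a value for an age not in the data.
predicted_65 = slope * 65 + intercept
print(round(slope, 2), round(intercept, 2), round(predicted_65, 2))
```

This only works because the inputs are already numeric fields; the same calculation could not be run directly on a free-text note.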
Unstructured data, on the other hand, is often categorized as qualitative data, and it is difficult to process and analyze using conventional tools and methods. In a business context, qualitative data can, for example, come from customer surveys, interviews, and social media interactions. Extracting insights from qualitative data requires advanced analytics techniques like data mining and data stacking.
3) Storage in Data Warehouses vs Data Lakes
Structured data is often stored in data warehouses, while unstructured data is stored in data lakes. A data warehouse is the endpoint for the data’s journey through an ETL (extract, transform, load) pipeline. A data lake, on the other hand, is a large, highly scalable repository where data is stored in its original format or after undergoing a basic “cleaning” process.
Both can be hosted in the cloud. Structured data generally requires less storage space, while unstructured data requires more. For example, even a tiny image takes up more space than many pages of text.
4) Ease of Analysis
One of the most significant differences between structured and unstructured data is how well each lends itself to analysis. Structured data is easy to search, both for humans and for algorithms. Unstructured data, on the other hand, is intrinsically more difficult to search and requires processing to become understandable. It’s challenging to deconstruct because it lacks a predefined data model and hence doesn’t fit into relational databases.
While there is a wide array of sophisticated analytics tools for structured data, most analytics tools for mining and arranging unstructured data are still maturing. The lack of predefined structure makes data mining tricky, and developing best practices for handling data sources like rich media, blogs, social media data, and customer communication is a challenge.
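The difference in search effort can be sketched with a small Python example. The same blood pressure reading is looked up once from a structured record and once from a free-text note. All records, notes, and the extract_bp helper are hypothetical, and the regular expression is only a rough illustration that covers two phrasings; real clinical text would need far more robust processing.

```python
import re

# Structured: each value already sits in a named field (hypothetical data).
structured = [
    {"patient_id": 1, "systolic": 152, "diastolic": 94},
    {"patient_id": 2, "systolic": 118, "diastolic": 76},
]

# Unstructured: the same facts buried in free-text notes (hypothetical data).
notes = {
    1: "Pt reports headache. BP 152/94 at triage, recheck in 1 hr.",
    2: "Blood pressure today was 118 over 76; no complaints.",
    3: "Patient anxious about readings, declined measurement.",
}

# Searching structured data is a direct comparison on a field.
high_structured = [r["patient_id"] for r in structured if r["systolic"] >= 140]

# Searching unstructured data first requires extracting the value.
BP_PATTERN = re.compile(r"(\d{2,3})\s*(?:/|over)\s*(\d{2,3})", re.IGNORECASE)


def extract_bp(text):
    """Return (systolic, diastolic) if a reading is found, else None."""
    match = BP_PATTERN.search(text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


high_unstructured = [
    pid
    for pid, text in notes.items()
    if (bp := extract_bp(text)) is not None and bp[0] >= 140
]

print(high_structured, high_unstructured)
```

The structured lookup needs no interpretation, while the free-text lookup depends on a pattern that will miss any phrasing it was not written for.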
5) Predefined Format vs Variety of Formats
The most common format for structured data is text and numbers. Structured data has been defined beforehand in a data model.
Unstructured data, on the other hand, comes in a variety of shapes and sizes. It can consist of everything from audio, video, and imagery to email and sensor data. There is no data model for the unstructured data; it is stored natively or in a data lake that doesn’t require any transformation.