Artifacts, Data Dictionary, Data Modeling, Data Wrangling
1. Artifacts | Data Dictionary | Data Modeling | Data Wrangling
Md Faisal Akbar
An artifact is one of many kinds of tangible by-products produced during the
development of software.
Some artifacts (e.g., use cases, class diagrams, and other Unified Modeling
Language (UML) models, requirements and design documents) help describe
the function, architecture, and design of software.
Other artifacts are concerned with the process of development itself—
such as project plans, business cases, and risk assessments.
Artifacts are typically living documents that are formally updated to reflect
changes in scope. They exist so that everyone involved in the project has a
shared understanding of all information related to the effort.
4. What is a data dictionary?
◇ It is an integral part of a database.
◇ It holds information about the
database and the data that it stores.
◇ A data dictionary is a “virtual database”
containing metadata (data about data).
5. META DATA
Metadata is defined as data providing
information about one or more aspects of the
data, such as:
◇ Time and date of creation.
◇ Authorization of the data.
◇ Attribute size.
◇ Purpose of the data.
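To make the four aspects above concrete, here is a minimal sketch of a metadata record for a single attribute. The class name `AttributeMetadata`, its field names, and every value are hypothetical, chosen only to mirror the bullets; they are not taken from any particular DBMS.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class AttributeMetadata:
    """Metadata describing one attribute (hypothetical field names)."""
    name: str
    created_at: datetime      # time and date of creation
    authorized_roles: tuple   # who is authorized to use the data
    size: int                 # attribute size, here in characters
    purpose: str              # purpose of the data


# Illustrative entry; every value below is made up for the example.
email_meta = AttributeMetadata(
    name="customer_email",
    created_at=datetime(2020, 1, 15, 9, 30, tzinfo=timezone.utc),
    authorized_roles=("analyst", "dba"),
    size=254,
    purpose="Contact address for order notifications",
)

# A plain dict form is convenient for storing or printing the entry.
record = asdict(email_meta)
print(record)
```

Note that the record describes the attribute, not any customer's actual email value: it is data about data.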
6. It is where the systems analyst goes to define or look
up information about entities, attributes and relationships
on the ERD (Entity Relationship Diagram).
7. Viewing the data dictionary
SELECT * FROM DICT;
SELECT * FROM DICTIONARY;
Either query lists all tables and views of the data dictionary that are
accessible to the user (in Oracle, DICT is a synonym for DICTIONARY). The
selected information includes the name and a short description of each table
and view.
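The DICT and DICTIONARY views are Oracle-specific. As a rough analogue that runs without an Oracle instance, the sketch below queries SQLite's built-in catalog through Python's `sqlite3` module. The `employee` table and `employee_names` view are hypothetical, created only so the catalog has something to list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("CREATE VIEW employee_names AS SELECT name FROM employee")

# sqlite_master is SQLite's built-in catalog of tables, views, and indexes.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY name"
).fetchall()

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = conn.execute("PRAGMA table_info(employee)").fetchall()

for obj_type, obj_name in catalog:
    print(obj_type, obj_name)
for cid, col_name, col_type, notnull, default, pk in columns:
    print(col_name, col_type, "NOT NULL" if notnull else "NULL", "PK" if pk else "")
```

Unlike Oracle's DICTIONARY, the SQLite catalog carries no description column; it stores the object type, name, and the SQL text that created the object.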
8. Data Dictionary provides information about
Relationship to other variables
Precision of data
11. Structure of Data Dictionary
... systems all have some form of ... It can be ... the DBMS or ... changes in the ...
12. Disadvantages of a Data Dictionary
Creating a new data dictionary is a large undertaking
and can take years.
It requires management commitment,
which is not easy to achieve,
particularly where the benefits are
intangible and long term.
A data dictionary can be costly: the cost
includes the initial build and hardware
charges as well as ongoing maintenance.
It needs careful planning,
defining the exact requirements,
designing its contents, testing, ...
13. What is a Data Model?
Graphical Representation of tables
Represent relationship between
Phases of Data Model
14. Conceptual Data Model
Only “Entities” visible
No attribute is specified.
No primary key is specified.
15. Logical Data Model
Includes all entities and relationships
The primary key for each entity is specified.
Foreign keys are specified
Normalization occurs at this level.
User-friendly attribute names
More detailed than Conceptual Model
The steps for designing the logical data model
are as follows:
1. Specify primary keys for all entities.
2. Find the relationships between different entities.
3. Find all attributes for each entity.
4. Resolve many-to-many relationships.
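The four steps above can be sketched with plain Python dataclasses. The entities `Student`, `Course`, and `Enrollment`, and all sample values, are hypothetical; the point is only to show primary keys, attributes, and a many-to-many relationship resolved through an associative entity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Student:
    student_id: int   # primary key (step 1)
    full_name: str    # attribute (step 3)


@dataclass(frozen=True)
class Course:
    course_id: int    # primary key (step 1)
    title: str        # attribute (step 3)


@dataclass(frozen=True)
class Enrollment:
    """Associative entity resolving Student <-> Course many-to-many (step 4)."""
    student_id: int   # foreign key -> Student.student_id
    course_id: int    # foreign key -> Course.course_id


students = {s.student_id: s for s in [Student(1, "Ada"), Student(2, "Grace")]}
courses = {c.course_id: c for c in [Course(10, "Databases"), Course(20, "Statistics")]}
# Step 2: the relationship "student takes course" lives in these rows.
enrollments = [Enrollment(1, 10), Enrollment(1, 20), Enrollment(2, 10)]


def courses_for(student_id: int) -> list:
    """Follow the foreign keys from a student to the courses they take."""
    return [courses[e.course_id].title for e in enrollments if e.student_id == student_id]


print(courses_for(1))
```

Without `Enrollment`, each student would need a list of courses and each course a list of students; the associative entity replaces that with two one-to-many relationships.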
16. Physical Data Model
Physical data model represents how the model will be
built in the database
Entities are referred to as tables
Attributes are referred to as columns
Foreign keys are used to identify relationships
Denormalization may occur based on user requirements
Database compatible Table names
Database compatible Column names
Database specific data types (For example, data
type for a column may be different between MySQL
and SQL Server)
The steps for physical data model design are as follows:
1. Convert entities into tables.
2. Convert relationships into foreign keys.
3. Convert attributes into columns.
4. Modify the physical data model based on physical
constraints / requirements.
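As one possible illustration of these steps, the sketch below builds a small physical model in SQLite through Python's `sqlite3` module. The tables `customer` and `customer_order`, their columns, and the chosen data types are assumptions made for the example; another DBMS would use its own type names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this is switched on per connection.
conn.execute("PRAGMA foreign_keys = ON")

# Steps 1-3: entities become tables, attributes become columns,
# and the customer-to-order relationship becomes a foreign key.
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    full_name   TEXT NOT NULL
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    order_total REAL NOT NULL
);
""")

conn.execute("INSERT INTO customer VALUES (1, 'Ada')")
conn.execute("INSERT INTO customer_order VALUES (100, 1, 25.0)")

# An order pointing at a customer that does not exist should be rejected.
try:
    conn.execute("INSERT INTO customer_order VALUES (101, 999, 5.0)")
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True

# Each row: (id, seq, referenced table, from column, to column, ...)
fk_rows = conn.execute("PRAGMA foreign_key_list(customer_order)").fetchall()
print(fk_enforced, fk_rows[0][2:5])
```

Step 4 is where this sketch would diverge per system: names such as `customer_order` avoid the reserved word ORDER, and types such as `REAL` are SQLite-specific choices.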
17. Compare Stages of Data Model
Feature               Conceptual  Logical  Physical
Entity Names          ✓           ✓
Entity Relationships  ✓           ✓
Primary Keys                      ✓        ✓
Foreign Keys                      ✓        ✓
Table Names                                ✓
Column Names                               ✓
Column Data Types                          ✓
18. Data wrangling
Data wrangling is the process of cleaning, structuring and enriching raw data into a desired format
for better decision making in less time.
Key Steps of Data Wrangling:
Data Acquisition: Identify and obtain access to the data within your sources
Joining Data: Combine the edited data for further use and analysis
Data Cleansing: Redesign the data into a usable/functional format and correct/remove any bad data
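A minimal sketch of the three steps, using only Python's standard library. The two inline CSV sources, the column names, and the cleaning rules are all hypothetical. Cleansing runs before joining here, since the joining step is described as combining the already edited data.

```python
import csv
import io

# Hypothetical raw sources, inlined so the sketch is self-contained.
RAW_CUSTOMERS = "customer_id,name\n1,  Ada\n2,Grace\n3,\n"
RAW_ORDERS = "customer_id,amount\n1,10.50\n2,n/a\n1,4.50\n"


def acquire(raw_text):
    """Data acquisition: read rows from a CSV source."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def clean_customers(rows):
    """Data cleansing: trim whitespace and drop rows with no name."""
    cleaned = {}
    for row in rows:
        name = row["name"].strip()
        if name:
            cleaned[int(row["customer_id"])] = name
    return cleaned


def clean_orders(rows):
    """Data cleansing: keep only orders whose amount parses as a number."""
    cleaned = []
    for row in rows:
        try:
            cleaned.append((int(row["customer_id"]), float(row["amount"])))
        except ValueError:
            continue  # a bad value such as "n/a" is removed
    return cleaned


def join(customers, orders):
    """Joining data: total order amount per known customer."""
    totals = {}
    for customer_id, amount in orders:
        if customer_id in customers:
            name = customers[customer_id]
            totals[name] = totals.get(name, 0.0) + amount
    return totals


customers = clean_customers(acquire(RAW_CUSTOMERS))
orders = clean_orders(acquire(RAW_ORDERS))
totals = join(customers, orders)
print(totals)
```

Dropping bad rows is only one cleansing policy; correcting them or flagging them for review are equally valid choices depending on the analysis.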