Dataprep Fundamentals Training Course
Dataprep is an intelligent data service that simplifies the visual exploration, cleansing, and organization of both structured and unstructured data, making it ready for analysis, reporting, and use in machine learning applications.
This instructor-led, live training (online or onsite) is designed for beginner to intermediate-level IT professionals who want to acquire the knowledge and practical skills needed to effectively prepare data for analysis, ensuring accuracy, consistency, and reliability across various datasets.
By the end of this training, participants will be able to:
- Understand the importance of data preparation in ensuring high-quality, reliable data for analysis and modeling purposes.
- Gain hands-on experience in data collection, cleaning, transformation, and integration techniques using real-world datasets.
- Develop the ability to identify and effectively address data-related challenges, discrepancies, and inconsistencies.
Format of the Course
- Interactive lectures and discussions.
- Plenty of exercises and practice sessions.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- To request customized training for this course, please contact us.
Course Outline
Introduction
- Understanding the importance of data preparation in analytics and machine learning
- Data preparation pipeline and its role in the data lifecycle
- Exploring common challenges in raw data and the impact on analysis
Data Collection and Acquisition
- Sources of data: databases, APIs, spreadsheets, text files, and more
- Techniques for collecting data and ensuring data quality during collection
- Collecting data from various sources
Data Cleaning Techniques
- Identifying and handling missing values, outliers, and inconsistencies
- Dealing with duplicates and errors in the dataset
- Cleaning real-world datasets
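As an illustration of the cleaning topics in this module, the sketch below removes exact duplicates, fills a missing value with the median, and flags an implausible value. It is a minimal example using only the Python standard library; the sample records, the field names, and the 0 to 120 plausible range for age are assumptions made for this example and are not course material.

```python
from statistics import median

# Hypothetical raw records; None marks a missing value.
raw_rows = [
    {"id": 1, "age": 34, "city": "Tashkent"},
    {"id": 2, "age": None, "city": "Samarkand"},
    {"id": 2, "age": None, "city": "Samarkand"},  # exact duplicate
    {"id": 3, "age": 29, "city": "Bukhara"},
    {"id": 4, "age": 31, "city": "Tashkent"},
    {"id": 5, "age": 240, "city": "Bukhara"},  # implausible value
]


def drop_duplicates(rows):
    """Keep the first occurrence of each fully identical record."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def impute_median(rows, field):
    """Replace missing values in `field` with the median of observed values."""
    observed = [r[field] for r in rows if r[field] is not None]
    fill = median(observed)
    return [{**r, field: fill if r[field] is None else r[field]} for r in rows]


def flag_outliers(rows, field, low, high):
    """Add an 'outlier' flag for values outside an assumed plausible range."""
    return [{**r, "outlier": not (low <= r[field] <= high)} for r in rows]


cleaned = flag_outliers(
    impute_median(drop_duplicates(raw_rows), "age"), "age", low=0, high=120
)
```

Flagging rather than deleting outliers keeps the decision about how to treat them visible to whoever analyzes the data later.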
Data Transformation and Standardization
- Data normalization and standardization techniques
- Categorical data handling: encoding, binning, and feature engineering
- Transforming raw data into usable formats
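The transformation topics above can be sketched in a few lines. The example below shows min-max normalization, z-score standardization, and one-hot encoding of a categorical column using only the Python standard library. The function names and sample values are hypothetical, and in practice a data preparation tool or library would usually perform these steps.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Standardize values to zero mean and unit (population) variance."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def one_hot(labels):
    """Encode each label as a dict of 0/1 indicator columns."""
    categories = sorted(set(labels))
    return [{f"is_{c}": int(label == c) for c in categories} for label in labels]


incomes = [20, 30, 40, 50, 60]  # assumed sample values
scaled = min_max_scale(incomes)
standardized = z_score(incomes)
encoded = one_hot(["red", "blue", "red"])
```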
Data Integration and Aggregation
- Merging and combining datasets from different sources
- Resolving data conflicts and aligning data types
- Techniques for data aggregation and consolidation
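To make the integration topics concrete, here is a small sketch that joins two hypothetical sources whose key columns have different types, then aggregates the result. The datasets, field names, and the choice to label unmatched rows "unknown" are assumptions for illustration only.

```python
from collections import defaultdict

# Two hypothetical sources: the key is a string in one and an integer in the other.
customers = [
    {"customer_id": "1", "region": "North"},
    {"customer_id": "2", "region": "South"},
]
orders = [
    {"customer_id": 1, "amount": "120.50"},
    {"customer_id": 2, "amount": "80.00"},
    {"customer_id": 1, "amount": "19.50"},
    {"customer_id": 3, "amount": "10.00"},  # no matching customer
]


def left_join(left, right, key):
    """Join `right` onto `left`, aligning key types by casting both to str."""
    lookup = {str(r[key]): r for r in right}
    joined = []
    for row in left:
        match = lookup.get(str(row[key]), {})
        joined.append({**match, **row, key: str(row[key])})
    return joined


def total_by(rows, group_field, value_field):
    """Sum `value_field` per group; rows without the group go to 'unknown'."""
    totals = defaultdict(float)
    for row in rows:
        totals[row.get(group_field, "unknown")] += float(row[value_field])
    return dict(totals)


joined = left_join(orders, customers, "customer_id")
totals = total_by(joined, "region", "amount")
```

Keeping unmatched orders under an explicit "unknown" group, instead of silently dropping them, makes the data conflict visible in the aggregated result.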
Data Quality Assurance
- Methods for ensuring data quality and integrity throughout the process
- Implementing quality checks and validation procedures
- Case studies and practical applications of data quality assurance
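Quality checks of the kind listed above are often expressed as simple rules applied to every record. The following sketch assumes a hypothetical rule format of (field, check function, message); it illustrates the idea and does not represent any particular tool's validation API.

```python
def validate(rows, rules):
    """Return (row_index, field, message) for every rule a record fails."""
    failures = []
    for index, row in enumerate(rows):
        for field, check, message in rules:
            if not check(row.get(field)):
                failures.append((index, field, message))
    return failures


# Hypothetical rules: each is (field name, predicate, failure message).
rules = [
    ("email", lambda v: isinstance(v, str) and "@" in v, "must contain '@'"),
    ("quantity", lambda v: isinstance(v, int) and v >= 0,
     "must be a non-negative integer"),
]

records = [
    {"email": "a@example.com", "quantity": 3},
    {"email": "not-an-email", "quantity": 1},
    {"email": "b@example.com", "quantity": -2},
]

failures = validate(records, rules)
```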
Dimensionality Reduction and Feature Selection
- Understanding the need for dimensionality reduction
- Techniques like PCA, feature selection, and reduction strategies
- Implementing dimensionality reduction techniques
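One of the simplest feature selection strategies is to drop columns whose values barely vary, since they carry little information. The sketch below shows that low-variance filter only; it is not PCA, and the column names, sample values, and threshold are assumptions made for this example.

```python
from statistics import pvariance


def select_by_variance(columns, threshold):
    """Keep only columns whose population variance exceeds `threshold`."""
    return {
        name: values
        for name, values in columns.items()
        if pvariance(values) > threshold
    }


# Hypothetical feature table stored column by column.
features = {
    "height_cm": [150, 160, 170, 180],
    "constant_flag": [1, 1, 1, 1],  # zero variance, carries no information
    "weight_kg": [50, 65, 70, 90],
}

kept = select_by_variance(features, threshold=0.0)
```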
Summary and Next Steps
Requirements
- Basic understanding of data concepts
Audience
- Data analysts
- Database administrators
- IT professionals
Testimonials
It's a hands-on session.
Vorraluck Sarechuer - Total Access Communication Public Company Limited (dtac)
Course - Talend Open Studio for ESB
I generally enjoyed the knowledge of the trainer.
Eddyfi Technologies
Course - GDPR Workshop
Related Courses
EBX5 for Developers
21 Hours
This instructor-led, live training in Uzbekistan (online or onsite) is aimed at developers who wish to use EBX5 (TIBCO EBX) to enable a Master Data Management solution within their organization.
By the end of this training, participants will be able to:
- Interpret requirements and architect an MDM solution.
- Enable the management and integration of master data.
- Integrate and transfer data across multiple systems.
- Import data into EBX5 using match and merge logic.
- Design, create and document a data model that addresses their organization's business requirements.
- Integrate EBX5 with 3rd party services.
GDPR Workshop
7 Hours
This one-day course is designed for individuals seeking a concise overview of the GDPR (the General Data Protection Regulation), which came into effect on May 25, 2018. It is particularly suitable for managers, department heads, and employees who need to grasp the fundamental aspects of the GDPR.
How to Audit GDPR Compliance
14 Hours
This course is primarily designed for auditors and other administrative roles responsible for ensuring that their control systems and IT environments comply with current laws and regulations. The course will start by providing a clear understanding of key GDPR concepts and how these will impact the work of auditors. Participants will delve into the rights of data subjects, the obligations of data controllers and processors, and the enforcement and compliance mechanisms outlined in the Regulation. Additionally, the training will cover ISACA's audit program, which equips auditors to evaluate GDPR governance and response frameworks, as well as supporting processes that can help mitigate the risks associated with non-compliance.
GDPR Advanced
21 Hours
This course provides a comprehensive understanding of the GDPR and is designed for individuals who work extensively with it, particularly those who may be part of the GDPR team. It is especially suitable for IT, human resources, and marketing professionals who will need to handle GDPR-related tasks frequently.
Oracle GoldenGate
14 Hours
This instructor-led, live training in Uzbekistan (online or onsite) is aimed at sysadmins and developers who wish to set up, deploy, and manage Oracle GoldenGate for data transformation.
By the end of this training, participants will be able to:
- Install and configure Oracle GoldenGate.
- Understand Oracle database replication using the Oracle GoldenGate tool.
- Understand the Oracle GoldenGate architecture.
- Configure and perform a database replication and migration.
- Optimize Oracle GoldenGate performance and troubleshoot issues.
PECB GDPR - Certified Data Protection Officer
35 Hours
The PECB Certified Data Protection Officer training course equips you with the essential knowledge and skills needed to effectively carry out the role of a data protection officer in the implementation of a GDPR compliance program.
Why should you attend?
As personal data becomes increasingly valuable, the need for organizations to safeguard it is also growing. Non-compliance with data protection regulations not only violates individuals' fundamental rights and freedoms but can also lead to risky situations that may damage an organization's credibility, reputation, and financial standing. This is where your skills as a data protection officer come into play.
The PECB Certified Data Protection Officer training course will help you gain the knowledge and skills required to serve as a Data Protection Officer (DPO) and assist organizations in ensuring compliance with the General Data Protection Regulation (GDPR).
Through practical exercises, you will master the role of the DPO and become proficient in informing, advising, and monitoring GDPR compliance, as well as collaborating with supervisory authorities.
After completing the training course, you can take the exam. If you pass successfully, you can apply for the “PECB Certified Data Protection Officer” credential. The internationally recognized “PECB Certified Data Protection Officer” certificate will demonstrate your professional capabilities and practical knowledge in advising controllers and processors on how to meet their GDPR obligations.
Who should attend?
- Managers or consultants looking to prepare and support an organization in planning, implementing, and maintaining a compliance program based on the GDPR
- DPOs and individuals responsible for ensuring conformance with the GDPR requirements
- Members of information security, incident management, and business continuity teams
- Technical and compliance experts preparing for a data protection officer role
- Expert advisors involved in personal data security
Learning objectives
- Understand the concepts of the GDPR and interpret its requirements
- Comprehend the content and the correlation between the General Data Protection Regulation and other regulatory frameworks and applicable standards, such as ISO/IEC 27701 and ISO/IEC 29134
- Acquire the competence to perform the role and daily tasks of the data protection officer in an organization
- Develop the ability to inform, advise, and monitor compliance with the GDPR and collaborate with the supervisory authority
Personal Data Protection Officer - Basic Level
21 Hours
Purpose of the Training
- Acquaint participants with a systematic, comprehensive overview of how personal data protection functions under Polish and European law
- Provide practical knowledge of the new rules for processing personal data
- Present the areas of greatest legal risk arising from the entry into force of the GDPR
- Prepare participants to perform the duties of a Personal Data Protection Officer independently
Personal Data Protection Officer - Advanced Level
14 Hours
Purpose of the Training
- Gain practical knowledge of how to perform the tasks of the Data Protection Officer
- Gain practical knowledge of how to conduct audits and assess risk
- Provide practical knowledge of the new rules for processing personal data
Microsoft Purview: Data Governance and Compliance
14 Hours
This instructor-led, live training in Uzbekistan (online or onsite) is aimed at beginner-level, intermediate-level, and advanced-level data professionals who wish to use Microsoft Purview to enhance their data governance and compliance capabilities.
By the end of this training, participants will be able to:
- Install and configure Microsoft Purview.
- Implement data governance and compliance policies.
- Utilize data discovery and classification features.
- Monitor and manage data compliance.
Sensor Fusion Algorithms
14 Hours
Sensor Fusion involves the combination and integration of data from various sensors to offer a more accurate, reliable, and contextually rich view of the information.
The implementation of Sensor Fusion requires algorithms that can effectively filter and integrate data from different sources.
Audience
This course is designed for engineers, programmers, and architects who work with multi-sensor systems.
Talend Administration Center (TAC)
14 Hours
This instructor-led, live training in Uzbekistan (online or onsite) is aimed at system administrators, data scientists, and business analysts who wish to set up Talend Administration Center to deploy and manage the organization's roles and tasks.
By the end of this training, participants will be able to:
- Install and configure Talend Administration Center.
- Understand and implement Talend management fundamentals.
- Build, deploy, and run business projects or tasks in Talend.
- Monitor the security of datasets and develop business routines based on the TAC framework.
- Obtain a broader comprehension of big data applications.
Talend Big Data Integration
28 Hours
This instructor-led, live training in Uzbekistan (online or onsite) is aimed at technical professionals who wish to deploy Talend Open Studio for Big Data to simplify the process of reading and crunching through Big Data.
By the end of this training, participants will be able to:
- Install and configure Talend Open Studio for Big Data.
- Connect with Big Data systems such as Cloudera, Hortonworks, MapR, Amazon EMR and Apache.
- Understand and set up Open Studio's big data components and connectors.
- Configure parameters to automatically generate MapReduce code.
- Use Open Studio's drag-and-drop interface to run Hadoop jobs.
- Prototype big data pipelines.
- Automate big data integration projects.
Talend Cloud
7 Hours
This instructor-led, live training in Uzbekistan (online or onsite) is aimed at data administrators and developers who wish to manage, monitor, and operate data integration processes using Talend Cloud services.
By the end of this training, participants will be able to:
- Navigate the Talend Management Console to manage users and roles in the platform.
- Evaluate data to find and understand relevant datasets.
- Create a pipeline to process and monitor data at rest or in action.
- Prepare data for analysis to generate insights relevant to the business.
Talend Data Stewardship
14 Hours
This instructor-led, live training in Uzbekistan (online or onsite) is aimed at beginner to intermediate-level data analysts who wish to deepen their understanding and skills in managing and improving data quality using Talend Data Stewardship.
By the end of this training, participants will be able to:
- Gain a comprehensive understanding of the role of data stewardship in maintaining data quality.
- Use Talend Data Stewardship for managing data quality tasks.
- Create, assign, and manage tasks within Talend Data Stewardship, including workflow customization.
- Use the tool's reporting and monitoring capabilities to track data quality and stewardship efforts.
Talend Open Studio for ESB
21 Hours
In this instructor-led, live training in Uzbekistan, participants will learn how to use Talend Open Studio for ESB to create, connect, mediate and manage services and their interactions.
By the end of this training, participants will be able to:
- Integrate, enhance and deliver ESB technologies as single packages in a variety of deployment environments.
- Understand and utilize Talend Open Studio's most used components.
- Integrate any application, database, API, or Web services.
- Seamlessly integrate heterogeneous systems and applications.
- Embed existing Java code libraries to extend projects.
- Leverage community components and code to extend projects.
- Rapidly integrate systems, applications and data sources within a drag-and-drop Eclipse environment.
- Reduce development time and maintenance costs by generating optimized, reusable code.