
Data Cleaning and Exploration with Machine Learning
ISBN: 9789365892192
eISBN: 9789365891102
Authors: Tirupathi Rao Dockara
Rights: Worldwide
Edition: 2026
Pages: 432
Dimension: 8.5 × 11 inches
Book Type: Paperback
DESCRIPTION
Machine learning has become central to how organizations handle data in today’s world. With businesses generating vast amounts of information, the ability to clean, explore, and model data effectively is no longer optional; it is a critical skill for decision-making, innovation, and competitive advantage.
This book takes readers on a structured journey, starting with Python foundations and essential libraries. It discusses data cleaning, preprocessing, and exploratory analysis, and then explores text and time series data, dimensionality reduction, regression, classification, and clustering techniques. Advanced topics such as model evaluation, neural networks, deep learning, retrieval-augmented generation, and explainable AI are covered in detail and supported by real-world examples and case studies. Each chapter builds progressively on the last, combining theoretical grounding with practical application and vital industry practices.
By the end of the book, you will be equipped with the skills to handle raw datasets, uncover patterns, build and evaluate ML models, and apply advanced techniques responsibly. You will be confident in applying these methods to solve problems in your own domain as a competent data practitioner, ready to deliver insights and drive impact.
WHAT YOU WILL LEARN
● Understand Python foundations and essential data science libraries.
● Apply data cleaning methods to handle missing or noisy data.
● Perform exploratory data analysis using statistics and visualization.
● Work with text, time-series, and high-dimensional datasets.
● Build regression, classification, and clustering ML models.
● Evaluate models with metrics, validation, and hyperparameter tuning.
● Explore neural networks, deep learning, and explainable AI techniques.
● Implement real-world case studies and capstone data projects.
WHO THIS BOOK IS FOR
This book is for data analysts, data scientists, ML engineers, and business professionals who want to strengthen their skills in data preparation and modeling. It is also valuable for students, researchers, and software developers aiming to apply ML techniques effectively in real-world projects.
TABLE OF CONTENTS
1. Introduction to Data Science and Machine Learning
2. Setting Up Your Development Environment
3. Introduction to Integrated Development Environments
4. Exploring Essential Python Libraries
5. Introduction to Data Cleaning
6. Exploratory Data Analysis Made Easy
7. Demystifying Data Preprocessing from Raw to Refined
8. Unraveling Insights from Text and Time Series Data
9. Dimensionality Reduction Techniques
10. Building Regression Models for Confident Predictions
11. Supervised Learning for Developing Classification Models
12. Discovering Hidden Patterns with Clustering Techniques
13. Ensuring Model Reliability Through Evaluation
14. Techniques and Applications of RAG Pipelines
15. Fine-tuning and Evaluating Base LLMs
16. Putting It All Together with Case Studies
17. Best Practices and Tips from Industry Experts
18. Conclusion and Further Resources
ABOUT THE AUTHORS
Tirupathi Rao Dockara is a technology leader with more than twenty-five years of experience working on digital platforms, data systems, and AI solutions. He is the Chief Architect and Vice President at Centific, where he leads the digital architecture and cognitive command (DAC) initiative. His work focuses on building architectures, driving AI adoption, and developing platforms that help organizations use data more effectively and responsibly.
In addition to his leadership role in global enterprises, Tirupathi has also been an entrepreneur. He founded Astute Capitals, where he developed the CORRECT Framework for financial reconciliation, anti–money laundering, escrow, and regulatory reporting. The framework has been widely recognized for its effectiveness and marked an important milestone in his journey of innovation.
Over the years, he has led transformation programs across industries, including retail, banking, manufacturing, and technology services. His focus has always been on designing systems that are scalable, secure, and intelligent. More recently, he has been active in the field of generative AI, offering practical advice on when to use fine-tuning, retrieval-augmented generation, or hybrid approaches.
Tirupathi is also a familiar face on stage as a panelist and speaker. He has represented Centific at events such as the AI World Summits in Delhi and Bangalore, GAISA, HYSEA, JITO Connect, the AI World Power 30 Summit—Asia, and the PMI Pearl City Chapter. He also enjoys speaking at universities across India, sharing his perspective on AI and innovation with students and faculty.
Recognized for his leadership and product innovations, Tirupathi continues to mentor teams and professionals while bridging the gap between industry and academia. His work consistently combines strong digital architecture with the intelligence of AI to create a meaningful impact.
In this book, he brings together his professional experience, research, and entrepreneurial journey to share practical insights for readers who want to understand and apply data and AI in today’s technology-driven world.
Education
● Ph.D. in sustainable AI and data governance (research papers submitted for publication) - Chandigarh University
● M. Tech in data sciences (Specialization: machine learning, big data analytics, cloud computing) - BITS Pilani
● Master’s in computer applications - Osmania University, Hyderabad
● Bachelor’s in mathematics - Osmania University, Hyderabad


