Statistical Methods of Data Analysis

Major: Software Engineering
Code of subject: 6.121.03.E.057
Credits: 6.00
Department: Software
Lecturer: Vasyl Havrysh
Semester: 5th semester
Mode of study: full-time
Aim of the discipline: The main goal of the discipline is to give students the basic mathematical knowledge needed to solve problems in professional practice, together with analytical thinking skills and the ability to formulate technical problems mathematically. Statistical methods of data analysis is an important discipline of specialty 121 "Software Engineering". The course teaches students to: investigate discrete and interval statistical series of experimental data, determine their numerical parameters and represent them graphically; determine point and interval estimates of the distribution parameters of statistical data samples; investigate the influence of various factors on the results of measurements of experimental data; determine the correlation between jointly measured quantities from experimental data; and establish the form of this relationship using linear and non-linear regression.
Objectives: As a result of studying the discipline, the student should be able to: 1) analyze discrete and interval statistical series of experimental data; 2) determine their main numerical parameters; 3) represent them graphically; 4) investigate the influence of factors on the results of experiments and measurements using analysis of variance; 5) investigate the correlation between jointly measured quantities from experimental data; 6) establish the form of this relationship using regression analysis methods; 7) use Mathematica and Statistics packages to solve practical problems. The discipline develops the following student competencies. General: INT: the ability to solve complex specialized tasks or practical problems of software engineering, characterized by complexity and uncertainty of conditions, using theories and methods of information technologies. Professional: FCS3.1: the ability to investigate the influence of factors on experimental results using analysis of variance and to establish the relationship between data using correlation and regression analysis. The learning outcomes of this discipline detail the following program learning outcomes: 1) the ability to demonstrate knowledge and understanding of the scientific and mathematical principles underlying information technologies; 2) the ability to demonstrate knowledge and skills in data collection, analysis, processing and modeling in the subject area.
Learning outcomes: As a result of studying the discipline, the student must demonstrate mastery of the fundamental concepts, their main properties, and practical skills in applying them. The student must also demonstrate the following program learning outcomes: PR01. Know and apply in practice the principles and methods of data storage, extraction and processing. PR06. Be able to use statistical methods to determine the relationship between input and output parameters, to analyze the parameters of processes of different nature, and to establish the mutual dependence between various factors and process results.
Required prior and related subjects: Previous disciplines: probability theory and mathematical statistics; related and subsequent disciplines: artificial intelligence technologies in data engineering.
Summary of the subject: The discipline "Statistical methods of data analysis" consists of the following sections: "Discrete and interval statistical series of the results of measurements and experiments, their graphical representation and determination of the main numerical parameters", "Basic theoretical distribution laws of random variables", "Point and interval estimates of the distribution parameters of random variables", "Statistical hypotheses", "Fundamentals of analysis of variance", "Elements of correlation analysis", "Regression analysis".
Description:
1. General characteristics of statistical methods for processing the results of measurements and experiments, and their classification.
2. Discrete statistical series of observations, the distribution function and their graphical representation.
3. Basic numerical parameters of discrete statistical series.
4. Interval statistical series.
5. Basic distributions of random variables: binomial, uniform, normal, exponential, Poisson, Pearson (chi-square distribution), Weibull, Student's (t-distribution), Fisher-Snedecor.
6. Point and interval estimates of the distribution parameters of statistical data. Methods of determining point estimates. Interval estimates for normally distributed statistics.
7. Testing of statistical hypotheses. Hypotheses about the distribution of statistical data. Pearson, Kolmogorov and Smirnov tests.
8. Basics of analysis of variance. One-way and two-way analysis of variance.
9. Basics of correlation analysis. Correlation coefficient, V. Romanovsky's criterion and R. Fisher's function.
10. Basics of regression analysis. Linear regression. Regression coefficients and methods of their determination.
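Topics 9 and 10 above can be illustrated with a minimal Python sketch. It is not taken from the course materials: the function names `pearson_r` and `linear_fit` and the sample data are hypothetical, and the code uses only the standard library rather than the packages named in the course. It computes the sample correlation coefficient between two jointly measured quantities and the least-squares coefficients of the line y = a*x + b.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def linear_fit(xs, ys):
    """Least-squares slope a and intercept b of the line y = a*x + b."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sxy / sxx
    b = my - a * mx
    return a, b


if __name__ == "__main__":
    # Hypothetical jointly measured quantities (illustration only).
    x = [1.0, 2.0, 3.0, 4.0, 5.0]
    y = [2.1, 3.9, 6.2, 7.8, 10.1]
    a, b = linear_fit(x, y)
    print(f"r = {pearson_r(x, y):.4f}, a = {a:.4f}, b = {b:.4f}")
```

A correlation coefficient close to 1 or -1 suggests that a linear regression model is a reasonable first choice for the form of the relationship; values near 0 suggest that a linear model explains little of the variation.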
Assessment methods and criteria: Current control: completion and defense of laboratory work, completion of practical tasks, whole-class and selective oral questioning, and assessment of student activity, including submitted proposals, original solutions, clarifications and definitions. Examination control: written and oral examination, test control.
Criteria for assessing learning outcomes: The discipline ends with a semester control in the form provided by the curriculum, with a semester grade. The semester grade is the sum of the points for current control and exam control. The teacher communicates this information to the students at the first class of the discipline.
1. Points for current control are assigned before the beginning of the examination session. Students who have completed 100% of the current control work are admitted to the exam. A student who has completed less than 50% of the current control work is considered uncertified and may retake the discipline. A student who has completed more than 50% of the work, but not all of it, may complete the outstanding tasks and take the exam before a commission.
2. Points for practical classes are awarded on the basis of a written survey and general activity in class.
3. Points for laboratory work are awarded for a successful defense. The defense is considered successful if the student completed the laboratory work on time and in accordance with their assigned variant of the task, correctly prepared and defended the report, gave correct answers to oral questions, and was able to make corrections to the work at the teacher's request. If the defense of a laboratory work is delayed, the points for it are reduced by 1 for each week of delay.
4. Responsibility for non-compliance with the principles of academic integrity during the completion and defense of laboratory work: if the teacher finds signs of a violation of academic integrity during the defense, the work is not counted; the student receives a new variant of the task and can defend the laboratory work again for the minimum number of points (1 point).
Procedure and criteria for assigning points and grades: The total number of points (100) is the sum of the points received for current control (40) and for examination control (60). The points for current control consist of points for successfully completed laboratory work (30) and points obtained in practical classes (10).
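The point composition above can be written out as a short Python sketch. The function name `semester_grade` and the range checks are assumptions made for illustration; only the maximum values (30, 10, 60) come from the text.

```python
# Maximum points per component, as stated in the grading procedure.
MAX_LAB = 30        # laboratory work
MAX_PRACTICAL = 10  # practical classes
MAX_EXAM = 60       # examination control


def semester_grade(lab, practical, exam):
    """Return the semester grade as the sum of current and exam control points.

    The range checks are an assumption added for illustration.
    """
    for name, value, limit in (
        ("lab", lab, MAX_LAB),
        ("practical", practical, MAX_PRACTICAL),
        ("exam", exam, MAX_EXAM),
    ):
        if not 0 <= value <= limit:
            raise ValueError(f"{name} points must be between 0 and {limit}")
    current = lab + practical  # current control, at most 40 points
    return current + exam      # total, at most 100 points


if __name__ == "__main__":
    # Hypothetical student: 25 lab points, 8 practical points, 45 exam points.
    print(semester_grade(25, 8, 45))  # prints 78
```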
Recommended books:
1. Slyusarchuk Yu. M. Probability theory, mathematical statistics and probabilistic processes: a study guide / Yu. M. Slyusarchuk, Y. Ya. Khrom'yak, L. L. Javala, V. M. Tsymbal. Lviv: Lviv Polytechnic Publishing House, 2015. 364 p.
2. Bakhrushin V. E. Methods of data analysis: a study guide for students / V. E. Bakhrushin. Zaporizhzhia: KPU, 2011. 268 p. ISBN 978-966-414-103-8.
3. Barkovskyi V. V. Probability theory and mathematical statistics: a study guide / V. V. Barkovskyi, N. V. Barkovskaya, O. K. Lopatin. 5th edition. Kyiv: Center of Educational Literature, 2010. 424 p.
4. Yeleiko Ya. I. The theory of probabilities. Theorems, examples and problems: a teaching and methodical manual / Ya. I. Yeleiko, B. I. Kopytko, B. M. Trish. Lviv: Ivan Franko National University Publishing Center, 2009. 260 p.
5. van der Waerden B. L. Mathematical Statistics. London: George Allen & Unwin Ltd.; Springer-Verlag Berlin Heidelberg GmbH; softcover reprint of the original 1st ed. 1969 edition, 1960. 382 p.
6. Electronic educational and methodical complex "Statistical methods of data analysis". Location address: http://vns.lpnu.edu.ua/course/view.php?id=4767.
7. Principal Components and Factor Analysis [Electronic resource]. Access mode: http://www.fmi.uni-sofia.bg/fmi/statist/education/textbook/eng/stfacan.html.
8. Factor Analysis: Statistical Methods and Practical Issues (Quantitative Applications in the Social Sciences) [Electronic resource]. Access mode: https://www.amazon.com/gp/product/0803911661/ref=pd_sim_14_2?ie=UTF8&pd_rd_i=0803911661&pd_rd_r=G2FAYKQMVC1BG7863C1D&pd_rd_w=bnkFs&pd_rd_wg=vfAyn&psc=1&refRID=G2FAYKQMVC1BG7863C1D.