Research and Application on Spark Clustering Algorithm in Campus Big Data Analysis

Qing Hou (Nanjing Xiao Zhuang University, Jiangsu, Nanjing, 210017, China)
Guangjian Wang (Nanjing Xiao Zhuang University, Jiangsu, Nanjing, 210017, China)
Xiaozheng Wang (Nanjing Xiao Zhuang University, Jiangsu, Nanjing, 210017, China)
Jiaxi Xu (Nanjing Xiao Zhuang University, Jiangsu, Nanjing, 210017, China)
Yang Xin (Nanjing Xiao Zhuang University, Jiangsu, Nanjing, 210017, China)

Article ID: 1808


Big data analysis has penetrated all fields of society and brought about profound changes. However, relatively little research has examined how the big data held by colleges and universities can support student management. This paper takes student card records as the research sample and analyzes them with Spark-based data mining and the K-Means clustering algorithm, using scholarship evaluation as a case study. The analysis covers students’ daily behavior along multiple dimensions and can prevent unreasonable scholarship decisions caused by unfair factors such as plagiarism and the voting of teachers and students. It can also predict students’ absenteeism, physical health, and psychological status in advance, which makes student management more proactive, accurate, and effective.


Spark; Clustering algorithm; Big data; Data analysis; MLlib

