Download the code: https://github.com/tangjianpku/LINE
The following changes to the code are required:
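reconstruct.cpp, line 142, change to: printf("Number of vertices: %d \n", num_vertices);
reconstruct.cpp, line 315, change to: if ((i = ArgPos((char *)"-k-max", argc, argv)) > 0) max_k = atoi(argv[i + 1]);
In Microsoft Visual Studio, create four projects: line, reconstruct, normalize, and concatenate. Put each of the four source files into its corresponding project and compile to obtain the .exe files. Compiling the line project requires the boost package.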
Boost installation tutorial: https://www.jianshu.com/p/007c0e9a863c
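The modified line 315 of reconstruct.cpp reads the value that follows the -k-max flag on the command line. The sketch below illustrates that lookup pattern. FindArg and ParseMaxK are hypothetical names introduced here for illustration; the ArgPos function in the repository may behave differently (for example, in how it reports a flag with no value).

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical helper modelled on the ArgPos call quoted above (an assumption,
// not the repository's code): returns the index of `flag` in argv, or -1 if
// the flag is absent or is the last argument and so has no value after it.
int FindArg(const char *flag, int argc, char **argv) {
    assert(flag != nullptr);
    for (int a = 1; a < argc; a++) {
        if (std::strcmp(flag, argv[a]) == 0) {
            return (a == argc - 1) ? -1 : a;
        }
    }
    return -1;
}

// Reads the integer after -k-max, or returns `fallback` if the flag is missing.
int ParseMaxK(int argc, char **argv, int fallback) {
    int i = FindArg("-k-max", argc, argv);
    return (i > 0) ? std::atoi(argv[i + 1]) : fallback;
}
```

With the arguments "reconstruct.exe -depth 2 -k-max 1000", ParseMaxK would return 1000; without the flag it returns the fallback value.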
1. Reconstruct the original network with the reconstruct program (about 1 h)
reconstruct.exe -train net_youtube.txt -output net_youtube_dense.txt -depth 2 -k-max 1000
2. Run line twice to obtain the embeddings under first-order proximity and second-order proximity
line.exe -train net_youtube_dense.txt -output vec_1st_wo_norm.txt -binary 1 -size 128 -order 1 -negative 5 -samples 10000 -threads 40
line.exe -train net_youtube_dense.txt -output vec_2nd_wo_norm.txt -binary 1 -size 128 -order 2 -negative 5 -samples 10000 -threads 40
3. Normalize the results with the normalize program
normalize.exe -input vec_1st_wo_norm.txt -output vec_1st.txt -binary 1
normalize.exe -input vec_2nd_wo_norm.txt -output vec_2nd.txt -binary 1
4. Concatenate the first-order and second-order embeddings with the concatenate program
concatenate.exe -input1 vec_1st.txt -input2 vec_2nd.txt -output vec_all.txt -binary 1
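To make the normalize and concatenate steps concrete, here is a minimal in-memory sketch for a single vertex. It assumes that normalization means scaling each vector to unit L2 norm and that concatenation appends the second-order vector to the first-order one; the function names are hypothetical, and the actual programs work on embedding files rather than on std::vector.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Scales v to unit L2 norm (assumed meaning of "normalize" here).
// An all-zero vector is returned unchanged to avoid dividing by zero.
std::vector<float> L2Normalize(const std::vector<float> &v) {
    double sum_sq = 0.0;
    for (float x : v) sum_sq += static_cast<double>(x) * x;
    std::vector<float> out(v);
    if (sum_sq == 0.0) return out;
    const double norm = std::sqrt(sum_sq);
    for (float &x : out) x = static_cast<float>(x / norm);
    return out;
}

// Normalizes both vectors of one vertex, then appends the second-order
// vector to the first-order vector.
std::vector<float> CombineEmbeddings(const std::vector<float> &first,
                                     const std::vector<float> &second) {
    std::vector<float> out = L2Normalize(first);
    const std::vector<float> tail = L2Normalize(second);
    out.insert(out.end(), tail.begin(), tail.end());
    assert(out.size() == first.size() + second.size());
    return out;
}
```

Under these assumptions, two 128-dimensional vectors (the -size 128 used above) would combine into one 256-dimensional vector per vertex.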