Analysis of the evolution of scientific collaboration networks for the prediction of new co-authorships

Authors

F. Affonso, M. de O. Santiago, T. M. Rodrigues Dias

Keywords

Co-authorship networks, Scientific data repositories, Lattes Platform

Abstract

Doi: https://doi.org/10.1590/2318-0889202234e200033

When researchers publish an article together, the collaboration creates links between them, and these links form a scientific collaboration network. In this network, authors are represented by the nodes and papers by the edges. This raises the question of how the network evolves over time. Answering it requires understanding which factors are essential to the creation of a new connection. Therefore, the purpose of this article is to predict connections in co-authorship networks formed by PhDs with curricula registered in the Lattes Platform in the areas of Information Sciences and Biology. The Lattes Platform holds 6.6 million researcher curricula and is one of the most relevant and recognized scientific repositories worldwide. The following steps are performed. First, the data are extracted and organized, a step essential to the rest of the process. Then, co-authorship networks are generated from jointly published articles. Subsequently, the attributes to be used are defined and some metrics are calculated. Finally, machine learning algorithms estimate future scientific collaborations in the selected areas. The random forest and logistic regression algorithms showed the highest hit rates, and the preferential attachment attribute was identified as the most influential in the emergence of new scientific collaborations. The results make it possible to trace the evolution of the network of scientific collaborations among researchers at a national level, assisting development agencies in selecting future outstanding researchers.
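
The network construction and the preferential attachment attribute described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names and the toy author lists are hypothetical, and preferential attachment is assumed to follow its common link-prediction definition (the product of the two authors' degrees), which the abstract does not spell out.

```python
from collections import defaultdict
from itertools import combinations


def build_coauthorship_graph(papers):
    """Return {author: set of co-authors}; authors are nodes, shared papers are edges."""
    neighbors = defaultdict(set)
    for authors in papers:
        for a in authors:
            neighbors[a]  # touch the key so single-author papers still add a node
        for a, b in combinations(sorted(set(authors)), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return dict(neighbors)


def preferential_attachment(neighbors, u, v):
    """Assumed definition: degree(u) * degree(v)."""
    return len(neighbors[u]) * len(neighbors[v])


def rank_candidate_pairs(neighbors):
    """Score every pair of authors that has not yet collaborated, highest score first."""
    candidates = [
        (preferential_attachment(neighbors, u, v), u, v)
        for u, v in combinations(sorted(neighbors), 2)
        if v not in neighbors[u]
    ]
    # Sort by descending score, then alphabetically for a deterministic order.
    return sorted(candidates, key=lambda t: (-t[0], t[1], t[2]))


# Hypothetical toy data: each inner list is the author list of one article.
papers = [
    ["Ana", "Bruno", "Carla"],
    ["Ana", "Diego"],
    ["Carla", "Elisa"],
]

graph = build_coauthorship_graph(papers)
ranking = rank_candidate_pairs(graph)
for score, u, v in ranking:
    print(f"{u} - {v}: {score}")
```

In a supervised setting such as the one the abstract outlines, scores like these would serve as input attributes for a classifier (for example random forest or logistic regression), with labels taken from collaborations observed in a later time window.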

Published

2022-07-25

How to Cite

Affonso, F., Santiago, M. de O., & Rodrigues Dias, T. M. (2022). Analysis of the evolution of scientific collaboration networks for the prediction of new co-authorships. Transinformação, 34, 1–15. Retrieved from https://periodicos.puc-campinas.edu.br/transinfo/article/view/6473

Section

Data and Information in Online Environments