(Preamble note: most of the content of this article is either common knowledge or strong assumptions broadly accepted by academics and professionals. What is unique is the way we are confirming that knowledge.)
Spain has been for centuries a crossroads of cultures and peoples, starting with the Phoenicians, Greeks, Romans, Germanic tribes, Arabs, etc. Without diminishing the impact of the others, in Spain, as in most European countries, the Romans left the most relevant and permanent footprint. We have inherited not only our language but also our law, sewage and hydraulic systems, harbors… and one element that still defines the way we communicate and interact with our national peers: road networks.
When the Roman Empire conquered a new land, it would rush to build a meaningful and dense network of roads for different purposes, but mostly for (not exhaustive):
We can undoubtedly recognize that the Romans had a clear view of how infrastructure would play a strong integrative and cohesive role in their territory.
TWO THOUSAND YEARS LATER
Living in the geek economy, in a multiconnected society, with low-cost flights, 5G connectivity, 4K streaming… we might think of the Roman economic model as old-fashioned. Or maybe not.
At Kido Dynamics, we are driven by a strong academic and research purpose, not only to make our solutions meaningful, relevant and useful for today’s society, but also to understand how certain decisions that we take today could impact and define the future of our children tomorrow… to the same extent that we are still impacted by what the Romans did 2’000 years ago.
In that context, we have completed the largest and most extensive mobility analysis ever done worldwide in a given region, in this case Spain, with more than 4 billion trips analyzed over the last 12 months. Most of the findings we have uncovered will be publicly available in the coming weeks, but I wanted to share some interesting insights we discovered while diving through terabytes of data.
THE SOCIAL NETWORK PERSPECTIVE
In today’s digital ecosystem, social networks are an essential component of our day-to-day lives, and a big share of our social interactions takes place in the digital sphere.
Furthermore, at Kido we do most of our day-to-day work in a purely digital context (in fact, we have not even met most of our clients in person). But we are physicists and engineers by training, and we generally need evidence of how accurate our assumptions and models are versus reality. (In case you missed what we actually do: we describe and predict human behaviour by applying mathematical methods and procedures inspired by physics, such as the concept of maximum entropy in thermodynamics or the eigenstates and propagators of quantum mechanics.)
So, what if (certain) human behaviours could also be explained with other models, such as social networks or virality effects?
In this context, we defined a theoretical framework that would allow us to simplify some of our initial assumptions and extract relevant insights from our massive set of data. Since P2P communications would be too granular for a broad-based analysis (people interact on average with a maximum of 150 people), we decided to aggregate by administrative units: municipalities. Sample size: more than 15M users distributed across 8’131 municipalities in Spain.
As described previously, the basis of our work was the number of trips that people from every municipality made over the 12 months of 2018, so we classified all municipalities by the number of trips that each one generated towards or received from any other municipality. With this in hand, we play the following game: we define a municipality as a follower of another if more than 10 trips originating in its territory had the second municipality as their destination. The more municipalities are connected to a given one, the more followers it has and the more popular it is. In this social network, who are the influencers?
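The follower rule described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not our production pipeline: the function names (`count_followers`, `rank_influencers`), the input layout (a dictionary keyed by origin and destination) and the toy numbers are all hypothetical, and ignoring trips that start and end in the same municipality is an assumption made here for simplicity.

```python
from collections import defaultdict

# Threshold taken from the text: a follower link needs MORE than 10 trips.
MIN_TRIPS = 10


def count_followers(trips, min_trips=MIN_TRIPS):
    """Count follower municipalities per destination.

    trips: dict mapping (origin, destination) -> number of trips
           (hypothetical layout, chosen for illustration).
    Returns a dict mapping destination -> number of followers.
    """
    followers = defaultdict(set)
    for (origin, destination), n_trips in trips.items():
        # Assumption: trips inside the same municipality do not count.
        if origin != destination and n_trips > min_trips:
            followers[destination].add(origin)
    return {dest: len(origins) for dest, origins in followers.items()}


def rank_influencers(trips, min_trips=MIN_TRIPS):
    """Sort municipalities by follower count, most followed first."""
    counts = count_followers(trips, min_trips)
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Invented toy data, not real trip counts.
toy_trips = {
    ("A", "B"): 25,
    ("C", "B"): 11,
    ("B", "A"): 40,
    ("C", "A"): 10,   # exactly 10 trips: not enough to be a follower
    ("A", "A"): 500,  # internal trips are ignored
}

ranking = rank_influencers(toy_trips)
print(ranking)
```

In this toy example, B ends up with two followers (A and C) and A with one (B), while C receives no trips and does not appear in the ranking at all.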
To our surprise, a larger population does not necessarily imply being more popular. There is indeed a correlation, but we find remarkable cases such as San Fernando de Henares, which ranks higher than capital cities with more than six times its population. A reader familiar with Spanish geography will immediately see that San Fernando, as a satellite city of the country’s capital (Madrid), has a privileged location, both geographically and in terms of infrastructure. And this is entirely true. However, the same reader would also expect L’Hospitalet (55th, 2’532 followers), a member of Barcelona’s metropolitan area with a population density double that of Manhattan, to be among the top influencers. But it is not. The underlying dynamics are complex and sometimes unexpected.
If we represent the total number of trips received per municipality, we find interesting patterns:
It is well known that, for historical reasons, municipalities in the south are larger in area than those in the north (the same effect as with US counties from East to West), so one may think the pattern simply reflects that the bigger the area, the bigger the population. But things, as usual, are more complex: this is the so-called España vacía, or Empty Spain, as can be seen in the population density map of the country below. Ironically, Spain is one of the European countries with the lowest overall population density, but it is also home to the city with the highest population density in Europe. Spain is complex.
Since we had a good level of disaggregation, we decided to dig deeper and analyzed, for multiple municipalities of different sizes, not only the number of connections or followers they had, but also how these followers were distributed geographically.
Case 1: Madrid
Not much to comment on in this case. The largest metropolitan area in Spain is systematically connected with every point of the territory, helped by its central position.
Case 2: Barcelona
The second-largest metropolitan area in Spain has a strong bias in its connections towards heavily populated regions such as Valencia, Alicante, Zaragoza and Madrid, while exchanges and trips with the western part of the country are negligible.
Case 3: Bilbao
An important industrial hub in the north, with strong influence over the surrounding region (Basque Country, Navarra, Cantabria, Palencia, Burgos) and strong communication links with Madrid, Zaragoza and Valladolid.
Case 4: Orense
Orense represents a typical regional hub, with strong connections to the surrounding region but very limited (or nonexistent) connections with the rest of the territory.
With its difficult terrain, the region of Galicia has always been somewhat isolated from the rest of Spain, developing its own local economic ecosystem. It is important to note that many of these out-of-region connections run along three main communication axes: AP-6 (Valladolid and Madrid), A-8 (Asturias) and A-66 (Gijón/Badajoz).
Case 5: Siruela
For our last example we chose a village of 2’000 inhabitants in the so-called Empty Spain. We can clearly observe that:
Important to note: the closest motorway (near Don Benito) is 90 km, or 60 min, from Siruela, while Madrid is 274 km away, a 3 h 20 min drive. That is about the same travel time as Madrid-Valencia, a route almost 100 km longer.
Not surprisingly, Siruela’s residents visit Madrid, Badajoz (2 h 12 min) or Córdoba (2 h 13 min), but do not even consider visiting Sevilla, Valencia, Barcelona or Bilbao.
It is important to appreciate how infrastructure acts as a territory integrator, facilitating safe and fast access to multiple locations. There are multiple studies and analyses of this phenomenon, but most of them are based on a limited sample and concentrate on a small region or metropolitan area.
To our understanding, this is the largest and most exhaustive analysis ever done in a country to understand how infrastructure layout underpins communications, trade and exchange among municipalities, regions and, ultimately, people (using the number of trips as a basic, simplified proxy for this analysis).
In that context, we are still living with what the Romans decided 2’000 years ago, since most existing motorways simply follow the ancient Roman calzadas, with better bridges, tunnels and safety, but still connecting the same points defined 20 centuries ago.
We need to ask ourselves whether the decisions on infrastructure and mobility investment we are taking today are based on what we consider optimal for today’s needs, or whether they still rest on assumptions and mindsets that we have inherited, even though we have a massive and powerful set of data that could help us optimize (and rethink) this process.
So what will you do next time you have to invest billions in a new motorway to revitalize a region? Trust the Romans or trust the data?
At this point, I will omit any explanations related to anonymization, aggregation, privacy and GDPR compliance for the sake of simplicity. If you want to know more, please message me. I will just say that our analysis is based on more than 15M users, and that peer-to-peer communications are not considered or even analyzed, since they are not relevant or useful for our purposes.
The workings of the maximum entropy principle in collective human behavior. A Hernando, R Hernando, A Plastino, AR Plastino. Journal of The Royal Society Interface 10 (78), 20120758
Unravelling the size distribution of social groups with information theory in complex networks. A Hernando, D Villuendas, C Vesperinas, M Abad, A Plastino. The European Physical Journal B 76 (1), 87-97
We do not include municipalities with fewer than 300 inhabitants (which are mostly those with fewer than 10 trips per connection), in order to protect the privacy of users by working with large aggregated values only.