Vic, Sergey, Google+ and the logarithmic derivative

Back in 2011, Vic Gundotra and Sergey Brin appeared at the Web 2.0 Summit and talked about Google, more specifically Google+.
You can view the video of that interview on YouTube or below.

At the time, Google+ was—and still is at the time of writing—one of the most innovative social networks on earth and the first social network platform without users (oo). It’s even kind of weird to call it social.
One thing I am amazed by is the way managers and presidents like Vic can make sad statements and still look cool, making people believe that the product is actually quite amazing.
They claim that the figure of 40 million users per month was well above their expectations, without putting it in context. It is indeed a large number of users, which any company would love to gain in one month. Unfortunately, Vic and Sergey are not considering the logarithmic derivative. Or rather, they are choosing not to discuss it.

Let me describe it in piggy terms.

Let the growth of new subscribers to Google+ be, as they claim, g'(x) = +40 million per month.
Now let g(x) be the number of users the company already has available to subscribe to the new service. The relative rate of growth is given by the logarithmic derivative \frac{g'(x)}{g(x)}, and if g(x) is actually a very large number, there’s absolutely nothing to be proud of.
An increase of 40 million customers for a company with, let’s say, 100 million customers is a relative growth rate of 40%.
But the relative growth rate for a company with 1 billion, such as Google, would be only 4%.
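
For those who prefer to check the arithmetic, here is a minimal Python sketch of the two cases above. The function name relative_growth is my own choice, and the two user bases are simply the ones used in this post, not official figures.

```python
def relative_growth(new_users_per_month: float, current_users: float) -> float:
    """Logarithmic derivative g'(x) / g(x): growth relative to the existing base."""
    return new_users_per_month / current_users


new_users = 40e6  # g'(x): the 40 million per month quoted in the interview

for base in (100e6, 1e9):  # g(x): the two user bases used in the post
    rate = relative_growth(new_users, base)
    print(f"base of {base:,.0f} users: relative growth {rate:.0%} per month")
```

The same absolute number looks ten times less impressive once it is divided by a base that is ten times larger.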

Now Vic, are you still impressed?

(oo)

http://www.rosewaterandthyme.com/2011/06/chocolate-truffles/

Filed under General

Complexity is simple

Everybody knows that math is cool. I still have to think a bit when I’m asked whether I’d prefer a math book, or a veggie sandwich with a can of coke and a piece of chocolate cake.
An amazing thing I’ve noticed after studying dynamical systems theory is that complexity is actually not complex at all. Whenever I think about how complexity is generated, I realise that it often comes from very simple rules or very simple equations that regulate quantities in a very simple fashion.

It’s the aggregation of these simple effects that makes the system, well, complex. So complex that it becomes really hard to study it to make predictions about how its components will evolve in time.
Let me first define complexity. I have the (bad) habit of associating complexity with chaos, even though I am aware that chaotic systems are just a special type of complex systems. But since I don’t want to annoy you with useless formalities, for the purpose of this particular post only, I shall use complex and chaotic interchangeably.

I want to show two types of systems that caused me to dwell on the topic: one comes from the field of systems of ordinary differential equations and the other is a rule-based system.

Thanks to http://www.rosewaterandthyme.com for the tasty picture

A system of ordinary differential equations (ODEs) is basically a set of ODEs that must be solved “all together” (or at the same time). Whenever we think of chaotic systems, we imagine very large systems of high-order differential equations that may, at times, require an entire A4 sheet just to write them down and many more to actually solve them. That’s wrong, since chaotic behaviour can arise from a very simple system of ODEs.
The Rössler attractor is one of those in which chaos is generated by only three differential equations (with suitably chosen parameters a, b and c). I wrote about this a while ago.

\frac{dx}{dt} = -y - z\\ \frac{dy}{dt} = x + ay\\ \frac{dz}{dt} = b + z(x-c)

Another one is the Lorenz attractor which generates the so-called chaotic butterfly.

\frac{dx}{dt} = \sigma (y - x)\\ \frac{dy}{dt} = x (\rho - z) - y\\ \frac{dz}{dt} = xy - \beta z
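
To see how little it takes, here is a minimal Python sketch that integrates the Lorenz equations above with a hand-written fourth-order Runge-Kutta step and follows two trajectories whose starting points differ by one part in a hundred million. The parameter values (sigma = 10, rho = 28, beta = 8/3) are the classic textbook choice, not something stated in this post, and the step size and starting point are arbitrary assumptions of mine.

```python
def lorenz(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz system for state = (x, y, z)."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(f, state, dt):
    """Advance the state by one classical Runge-Kutta (RK4) step of size dt."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def integrate(f, state, dt, steps):
    """Apply rk4_step repeatedly and return the final state."""
    for _ in range(steps):
        state = rk4_step(f, state, dt)
    return state


# Two starting points that differ only in the eighth decimal place of z.
a = integrate(lorenz, (1.0, 1.0, 1.0), dt=0.01, steps=4000)
b = integrate(lorenz, (1.0, 1.0, 1.0 + 1e-8), dt=0.01, steps=4000)

separation = max(abs(p - q) for p, q in zip(a, b))
print("final states:", a, b)
print("largest coordinate difference:", separation)
```

Both trajectories stay on the same bounded butterfly, yet the tiny initial difference is amplified by many orders of magnitude. Three equations are enough to make long-term prediction hopeless.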

Another complex behaviour, not necessarily chaotic, is represented by self-sustaining systems. Such systems present a very interesting phase plane (a visual display of the trajectories of the differential equations that form the system) in which cycles appear. A cycle corresponds to an oscillatory behaviour. This means that under specific assumptions (parameters) and initial conditions, the system will keep going without any external intervention. Amazing, isn’t it?

A recurrent pattern in systems biology is represented by the system of ODEs below.

\frac{dX}{dt} = \alpha_1 -\beta_1 X\frac{Z^n}{{K_1}^n+Z^n} \\  \frac{dY}{dt} = \alpha_2(1-Y)\frac{X^n}{{K_2}^n+X^n}-\beta_2 Y \\  \frac{dZ}{dt} = \alpha_3(1-Z)\frac{Y^n}{{K_3}^n+Y^n}-\beta_3Z

This is the model of the Xenopus embryonic cell cycle, which shows limit cycle oscillations. The system, which seems complicated, is actually very simple once you notice that the fraction on the right side of each equation is basically a Hill function of the variables Z, X and Y respectively.
Although this model might look as if it had been devised for the sake of complexity, it really explains what happens in a wet lab.
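
To make the Hill function less mysterious, here is a small Python sketch that writes it down once and reuses it for the three right-hand sides of the system above. The function names are mine, and the numbers used in the loop (K = 0.5, n = 8) are illustrative assumptions only, not values taken from this post.

```python
def hill(s: float, K: float, n: float) -> float:
    """Hill function s^n / (K^n + s^n): a smooth switch from 0 to 1 around s = K."""
    return s**n / (K**n + s**n)


def cell_cycle_rhs(X, Y, Z, alpha, beta, K, n):
    """Right-hand sides (dX/dt, dY/dt, dZ/dt) of the three-variable model above.

    alpha, beta and K are 3-tuples holding (alpha_1..3), (beta_1..3), (K_1..3).
    """
    dX = alpha[0] - beta[0] * X * hill(Z, K[0], n)
    dY = alpha[1] * (1 - Y) * hill(X, K[1], n) - beta[1] * Y
    dZ = alpha[2] * (1 - Z) * hill(Y, K[2], n) - beta[2] * Z
    return dX, dY, dZ


# With a large exponent n the Hill function behaves almost like an on/off switch.
for s in (0.25, 0.5, 1.0):
    print(f"hill({s}, K=0.5, n=8) = {hill(s, 0.5, 8):.4f}")
```

Seen this way, each equation is just "production switched on by one variable, minus degradation", which is why the system is simpler than it looks.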
Another class of systems that can generate very different behaviours are rule-based systems.

I remember I had a lot of fun with Conway’s Game of Life when I was a little pig.

I was literally hypnotised when observing how a trivial initial condition and two simple rules could generate periodic and even chaotic patterns.
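
For the curious, the whole game fits in a few lines. This is a minimal Python sketch of one generation, with the board stored as a set of live (x, y) cells; the representation and the function name life_step are my own choices. The two rules are the usual ones: a dead cell with exactly three live neighbours is born, and a live cell with two or three live neighbours survives.

```python
from collections import Counter


def life_step(live):
    """Return the next generation of a set of live (x, y) cells."""
    # Count, for every cell on the board, how many live neighbours it has.
    neighbour_counts = Counter(
        (x + dx, y + dy)
        for x, y in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # Birth: exactly 3 neighbours. Survival: 2 or 3 neighbours and already alive.
    return {
        cell
        for cell, count in neighbour_counts.items()
        if count == 3 or (count == 2 and cell in live)
    }


# The "blinker": three cells in a row that flip between horizontal and vertical.
blinker = {(0, 1), (1, 1), (2, 1)}
print(sorted(life_step(blinker)))
print(life_step(life_step(blinker)) == blinker)
```

The blinker is the simplest periodic pattern: it returns to its starting shape every two generations.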

It really seems that chaos and complexity are everywhere, since they are so easy to achieve. I take comfort in knowing that mathematicians, and pigs of course, have the tools to manage it.

Off to my chocolate cookies
(oo)


Communication intensity

Hello Folks,
in my super busy schedule I still find time to post about my findings. In the meantime my roommate is approaching the end of his PhD, or at least seeing the light at the end of the tunnel. We are both waiting for the notification of two papers that might apparently be decisive for his doctoral work.
Ok cutting the b**shit here I am with another post…
In the era of communication and “extreme” technology – I would say extremely cheap technology – people have many more chances to talk to or poke each other (definitely more than in the past), and most of the time they do so because it costs nothing. Once upon a time they texted, then sent emails; now they add each other on social network platforms and call via Skype or phone much more, because it is actually cheap. I still don’t have my surname attached to the entry phone because people prefer to reach me in other ways.
A recent study that really caught my attention is one conducted in Belgium, in which the mobile traffic of 2.5 million people was analysed to conclude that “the communication intensity between two cities is proportional to the product of the city population sizes divided by the square of their distance”.
That is, the closer people are, the more they call each other (oo)
Mr. Coulomb would be so proud!
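
Written as code, the quoted law is a one-liner. In this minimal Python sketch the proportionality constant k and the example populations and distances are hypothetical placeholders of mine, not values from the study.

```python
def communication_intensity(pop_a: float, pop_b: float, distance_km: float, k: float = 1.0) -> float:
    """Gravity-like law: intensity proportional to pop_a * pop_b / distance^2.

    k is an unknown proportionality constant (set to 1 here purely for illustration).
    """
    return k * pop_a * pop_b / distance_km**2


near = communication_intensity(1_000_000, 500_000, distance_km=50)
far = communication_intensity(1_000_000, 500_000, distance_km=100)
print("ratio near / far:", near / far)
```

Doubling the distance between the same two cities cuts the expected intensity to a quarter, exactly as it would for two electric charges.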

Reference
Mauro Martino Projects


We need to talk

“Are you saying that you wanna break up?”
“Dear, I found a stable spiral… our story was complex”


Big data is driving modern research

I planned to write about big data a while ago. But some blogs beat me to it, so I’d better post it now, before it’s too late. Indeed, big data is hot today and nobody knows how it’ll be in the near future. I have the feeling that big data is a trend that is susceptible to weird expiration dates.
According to several posts I’ve been reading, and also from looking around, every scientific field is affected by the presence of large amounts of data today. This is mainly due to widely accepted reasons like cheaper network and storage infrastructure, social networks and modern tools (both hardware and software) that allow any user to generate tons of data and keep them all in some directory on her local hard drives or even online.
It seems, though, that this phenomenon is growing nonlinearly with respect to the real need for data.
The presence of big data is changing the way we are facing problems too. Statistics is gaining more and more attention as the “Tool” to understand data and reveal the hidden knowledge behind it (data mining is just a cool name to teach statistics without losing students after 3 minutes).
In my opinion, the side effect of this phenomenon (or at least that’s what I am experiencing) is that many more researchers are thinking that statistical analysis is the only way to handle large amounts of data, and traditional methods are not (even) considered anymore because they make the problem intractable.
A nice discussion about this has been started in the Computational Biology group on LinkedIn, in which it is explicitly asked why statistical modeling is overtaking partial/ordinary differential equations and hierarchical modeling with regard to genomics, proteomics and the -omics in general. Especially in computational biology it seems, indeed, that the only way to “read” massive data is statistics.
But while this has proved true for special types of applications that study social consensus, social behaviour, economic and other human-related scenarios, there are still research fields where big data is not bringing improvements with the same impact. How is that?
Probably that’s because more data doesn’t always mean more information. It could also mean more noise and many more issues to deal with. In the human genome research field, Next Generation Sequencing (NGS) is feeding researchers with many more issues than they expected, when everybody thought that collecting more data in a fraction of the time needed by older technologies would solve things. That clearly is not the case (unfortunately).
That’s why I am a bit worried about the role that data-driven methods are playing, and I would be happier if the abundance of data were considered a support to intuitive models, not a complete substitute.

An interesting article I found about big data


Selling crap for gold

Hello folks! It’s been quite a long time since I last dropped a line. Academic work can have peaks that will prevent you even from eating, and you know how bad that could be for a pig like me.
In this post I would like to relax a bit, forget about math and share with you a recent discussion I had with a professional.
Do you remember my last trip to Prague? No?
Indeed you can’t. I never wrote about that.
Well, I’ve been to Prague with an old friend of mine and on our way back we were having a discussion about… economy. Let’s say that we were a bit bored and started talking about the impact of an idea on the market. It was a pretty hard discussion, since my friend, a telecom engineer, is now attending a master in business administration and feels so excited about it that he usually talks in a way I’m not comfortable with.
We tried to answer the question “what’s easier to sell: a crappy idea or one that might have potential but needs some kind of optimization or package?” We ended up with two different opinions and we were almost fighting over them.
My friend came to the conclusion that a crappy idea might be easier to sell while a cool one might be difficult to understand. The impact of the latter on the marketplace would be underrated and the final product could end up a flop. I think he confused crappy with simple.
By the way it was a pretty sound conclusion. But I turned everything around and argued that when an idea is cool – where cool is to be defined – it might have more impact on the marketplace. As a direct consequence its final product might be a lot easier to sell.
Of course I was talking as an engineer. He deserted engineering and started talking like an MBA in opposition to my argument. He continued with statements like “any idea needs a package to be sold”, or “you can sell crap for gold”, coming with no real examples… and that was driving me crazy.
I found loads of high-impact ideas that found their place in the market, such as the mp3, Google and Twitter, just to avoid going back to ancient times. These ideas were not crap, right from the very beginning. They were well designed, with a specific target, and they worked whatever the package.
There are examples on my friend’s side though. Indeed I remembered an article published by Scientific American, which I read a while ago, that could have answered some of my doubts and probably my friend’s.
Thing is that the way my friend thinks (which is very common among MBA students) has changed the way ideas are spread and monetized. The fact that stakeholders ask for faster and faster returns on investment, forcing engineers to put immature ideas on the market, cannot be generalized to the claim that “selling crap for gold” is a way of doing business that actually works. The main side effect of such an attitude is undervaluing something that is interesting and has potential, because its immature state will convince customers that it is actually crap. Another consequence of this attitude is the so-called “Fix It Later Syndrome”, by which customers are guinea pigs that buy something, give feedback and will probably buy the next version they contributed to build. This actually worked for several open source projects. But, again, I’m not sure it applies everywhere.

As a conclusion, I don’t talk to that guy anymore
(oo)


Naming Game model: simulating social behaviour

Hello Folks!
I am posting this from far away, although I wrote it some time ago. Hope you enjoy my post.
A model that fascinates me a lot is the Naming Game model (NG). This model is a very interesting way to describe and simulate social behaviour in a general but effective fashion.
Advertisement, political campaigns, television and media influence, and marketing are only a short list of applications that can be represented by such a model.
The area of research from which this model comes is known as opinion dynamics.
According to this model, people speak to each other in a simplified way: a speaker randomly selects a hearer from a list of friends and speaks a randomly selected word from a list of topics. It is that easy.

Two basic rules are then applied to the model:
1) if the hearer doesn’t have the spoken word in her dictionary, she adds it
2) if the hearer has the word, she deletes all the other words in her dictionary but the received one
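
The two rules above can be sketched in a few lines of Python. This is only an illustration of a single conversation, not the code of my simulator; the function name interact and the use of plain sets as dictionaries are assumptions made for the example.

```python
import random


def interact(speaker_words, hearer_words, rng=random):
    """One conversation: the speaker utters a random word, the hearer updates.

    Both arguments are sets of words; only the hearer's set is modified.
    """
    word = rng.choice(sorted(speaker_words))  # sorted() gives a stable list to pick from
    if word not in hearer_words:
        hearer_words.add(word)   # rule 1: unknown word, add it to the dictionary
    else:
        hearer_words.clear()     # rule 2: known word, drop every other word...
        hearer_words.add(word)   # ...and keep only the received one
    return word


speaker = {"cake"}
hearer = {"tea", "pie"}
interact(speaker, hearer)
print(hearer)  # after rule 1 the hearer knows three words
interact(speaker, hearer)
print(hearer)  # after rule 2 only the spoken word is left
```

The first conversation enlarges the hearer’s dictionary, the second collapses it, and that tug of war between the two rules is all there is to the model.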

The evolution of such a system is not easy to predict because it depends on several factors such as how the words are selected, how speakers and hearers interact with each other and the topology of the network.

A paper that inspired me a lot is “Social consensus through the influence of committed minorities”. In this paper the regular NG model is enriched by the presence of committed agents, people who are immune to influence and will only try to turn the entire “society” to consensus.
The study is very interesting because the authors found that if the percentage of committed agents is below a critical threshold, the time to reach consensus grows exponentially with the size of the system, t ~ exp(a(p)N) (where N is the number of nodes and p the fraction of committed agents). If the percentage of committed agents is higher than the critical threshold, the system reaches consensus much faster, in t ~ ln N.
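
To get a feeling for how far apart the two scalings ln N and exp(a(p)N) are, here is a tiny Python sketch. The value a = 0.05 is a hypothetical placeholder I picked for illustration; the real a(p) depends on the fraction of committed agents and is not given in this post. Only the orders of magnitude matter here, since both expressions leave out constant factors.

```python
import math


def log_time(n: int) -> float:
    """Consensus time scaling as ln N (constant factors ignored)."""
    return math.log(n)


def exp_time(n: int, a: float) -> float:
    """Consensus time scaling as exp(a * N) (constant factors ignored)."""
    return math.exp(a * n)


a = 0.05  # hypothetical value of a(p), for illustration only

for n in (100, 200, 400):
    print(f"N = {n:4d}   ln N = {log_time(n):5.2f}   exp(a N) = {exp_time(n, a):.3e}")
```

Quadrupling the population barely moves the logarithmic time, while the exponential one explodes.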

I found this amazing! These kinds of models are normally described in analytical form with two relatively simple differential equations that describe how densities of people evolve in time. Analytical analysis can answer a lot of questions about the dynamical system.
But in order to see agents interacting with each other I decided to write a simulator that plots the evolution of the system in several frames and creates a movie of the entire simulation (until consensus is reached). I consider this piece of software my new toy, which might be useful to answer some questions that the analytical analysis cannot answer.
For instance, how would you take topology into account? Clearly there are topologies that are more favourable and lead to consensus faster than others (given the same number of words and agents).

At the moment it is quite simple, since I am ignoring the fact that people can be difficult to convince (this means that rule 2 should be applied with a certain probability).
But I am working on it in my spare time, making it more and more complex, complete and more realistic.

Simulator description
With this simulator I can specify the number of nodes (regular and committed agents), the number of random words in dictionaries, the topology of the network (a connection matrix that can be random, a fully connected graph, a small-world network, or user defined) and the number of seconds and fps of the final video.
Once launched, agents will start interacting with each other as if they were sharing opinions. Immune nodes will never be influenced by others’ opinions and will only try to convince their neighbors about the topics in their dictionary.
I found the scatter plot a good way to visualize this system. Each circle represents a node.
Colors are used to visualize nodes that are interacting (orange), immune nodes (red), regular nodes (blue). The size of a circle represents the number of words in the node’s dictionary at the current time.
Triangle nodes are committed agents. Nodes become red when they have been “persuaded” by committed agents.
At the end of the simulation, if consensus is reached, all nodes are smaller and red. That is, they have small dictionaries filled with words that come from immune agents’ dictionaries.
As in the paper above, by changing the number of immune nodes and keeping the same number of words, agents and topology, consensus is reached in a time that can be logarithmic or exponential. My simulations place the critical threshold between 8% and 9%.
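
To give an idea of the moving parts, here is a heavily simplified Python sketch of such a simulation. It is not my actual simulator: there is no video, no small-world topology and no plotting, only two opinions ("A" for committed agents, "B" for everybody else) instead of random dictionaries, and every name in it (build_neighbours, simulate, link_prob and so on) is hypothetical.

```python
import random


def build_neighbours(n, topology="full", link_prob=0.1, rng=random):
    """Return a sorted neighbour list per node.

    topology="full" links every pair; anything else links each pair
    independently with probability link_prob (a simple random graph).
    """
    neighbours = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if topology == "full" or rng.random() < link_prob:
                neighbours[i].add(j)
                neighbours[j].add(i)
    return [sorted(s) for s in neighbours]


def simulate(n=100, n_committed=20, topology="full", max_steps=100_000, seed=1):
    """Run conversations until everybody holds only "A" or max_steps is hit.

    Returns (steps or None, final dictionaries, total word count after each conversation).
    """
    rng = random.Random(seed)
    neighbours = build_neighbours(n, topology, rng=rng)
    # Assumption: nodes 0..n_committed-1 are committed to "A", the rest start with "B".
    words = [{"A"} if i < n_committed else {"B"} for i in range(n)]
    history = []
    for step in range(1, max_steps + 1):
        speaker = rng.randrange(n)
        if not neighbours[speaker]:
            continue  # isolated node in a random graph: nobody to talk to
        hearer = rng.choice(neighbours[speaker])
        word = rng.choice(sorted(words[speaker]))
        if hearer >= n_committed:  # committed (immune) agents never update
            if word in words[hearer]:
                words[hearer] = {word}   # rule 2: keep only the received word
            else:
                words[hearer].add(word)  # rule 1: learn the new word
        history.append(sum(len(w) for w in words))
        if all(w == {"A"} for w in words):
            return step, words, history
    return None, words, history


steps, words, history = simulate()
print("conversations needed for consensus:", steps)
```

Changing n_committed or passing topology="random" is enough to start playing with the questions raised above, while max_steps keeps the run bounded when consensus is slow to arrive.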

Results
For each simulation I also collect information about how many interactions are needed to reach consensus, how the number of words evolves in time, and what the success rate (how many times rule 1 is applied) and failure rate (how many times rule 2 is applied) are. Finally these values are plotted in a graph.

This is the simulation of 200 Agents, 4 of which are immune.

Below are the plots of number-of-words / number-of-iterations for three simulations with 4%, 9% and 16% immune agents respectively.

Here is the simulation of 10000 Agents, 200 of which are immune.

If you are interested in the source code or if you want to improve it or just give me some hints, I would be glad to receive an email at worldofpiggy@gmail.com

(oo)
