Track your comments!
[x]


When you register, comments on your articles and replies to your comments appear here. Register Now!

Sign in to your account
[x]

Not a Scientific Blogging member yet?

Register Now for a Free Scientificblogging.com Account

  • Customize your profile with pictures, banner, a blogroll and more.
  • Leave comments on articles, add other members to your friend lists, chat with people on the site.
  • Write blog posts that can be seen by hundreds of thousands of readers.

It's free and it only takes a minute!

Already a Scientific Blogging member?

Sign In Now

Banner
By Michael White | June 26th 2009 05:26 PM | 6 comments | Print | E-mail | Track Comments
.

More Adaptive Complexity articles

All

About Michael White

Welcome to Adaptive Complexity, where I write about genomics, systems biology, evolution, and the connection between science and literature, government, and society.

I'm a biochemist


... Full Bio

An interesting phenomenon in growing random networks:

The number of 3-node, 3-edge connected subgraphs in a random, scale-free network of N nodes scales as N0 (=1). No matter how big your network grows, you're going to have a roughly constant number of 3-node, 3-edge  subgraphs that depends only on the ratio of edges to nodes.

Let's back up a minute before we see why this counterintuitive result is so and what it means.  Imagine that we have a network made up of N nodes connected by E edges. You can start out with two nodes connected by one edge:



Your edge-node ratio here is obviously 0.5. Let's say that you gradually increase the size of your network, adding more edges and nodes (assigning the connections randomly), but always keeping an edge-to-node ratio of 0.5.

As your network grows, you start seeing subgraphs - subsets of three nodes connected by two edges, and three nodes connected by three edges. (As you network grows, you'll naturally also start seeing other subgraphs, with 4 nodes, etc.)

The number of V subgraphs (three node, two edge) scales linearly with the size of your network, as you might intuitively expect: the bigger your network, the more likely you are to find two nodes each connected to a third node.




But here's the rub: the number of triangle subgraphs, (three nodes, three edges) stays constant, no matter how large your network grows. 




Why? Because, while your V subgraphs scale with N, the likelihood of closing that V subgraph with a third edge (and thus forming a triangle) scales as 1/N.  As N grows, it becomes more and more unlikely that a randomly-assigned edge will connect those two nodes needed to close the triangle. So you network can get huge, the number of V nodes gets huge, but, as long as the edge/node ratio stays the same and you're assigning your edges randomly, the number of triangles will not substantially increase.

Why does this matter? Because triangle subgraphs (called feed-forward loops) crop up everywhere in biological networks. Various systems biologists, in particular the Weizmann Institute's Uri Alon, have used this fact to argue that feed-forward loops are doing something special in genetic networks. They crop up much, much more frequently than you would expect in a network with randomly assigned edges.

Of course we know that, like social networks, genetic networks don't have randomly assigned edges. It could be that feed-forward loops don't serve any particularly special function; it could be just an artifact of how these networks are built: if gene A is connected to gene B and gene C, maybe (through some obscure evolutionary process), gene C is very likely to end up connected to gene B, thus forming a triangle. We can see that this is obviously true in social networks: my friends have a high probability of knowing my other friends; that's just how social interactions work.

In biology, there is some evidence, however, that these feed-forward loops are more than just an artifact of making biased connections: there are some clear examples of this particular wiring diagram performing useful functions, functions well-suited to feed-forward loops. In other words, when you see a feed-forward loop in a genetic network, you already have a good idea of what the function of this loop might be. (Although of course you still have to go and test the function of the feed-forward loop to find out for sure.)

In the near future, I'll come back to what feed-forward loops can do. For the time being, here is the much more rigorous version of subgraph scaling (PDF):

Subgraphs in Random Networks, Itzkovitz, Milo, Kashtan, Ziv, and Alon, 2003.



Comments

jtwitten's picture
Is this result achieved using truely random assignment of edges (all nodes have equal probability of being connected to a new edge) or scale-free network building (probability of new edge being added to a node is directly proportional to the number of edges that the node is already connected to)?

adaptivecomplexity's picture
A good point to bring up - this result applies to scale-free networks, and not to uniformly random networks (which, instead of a scale-free connectivity distribution, have a Poisson distribution). Maybe something like this applies to uniformly random networks, but I don't know.

So another important question is whether various networks really are scale free - some of the initial conclusions about biological networks have been drawn from noisy high-throughput data, and some researchers have questioned whether genetic networks are scale-free.


jtwitten's picture
Shhh.  They might hear you.  "Everyone" knows that "all" important networks are scale-free.  Important people have said so.  Scale-free networks are robust, even to criticism.

adaptivecomplexity's picture
Scale-free networks are robust, even to criticism.

!!!!! I'm going to have to remember that one; that's perfect!

Granted that I don't know a lot about this stuff (though I want to learn more), it doesn't seem to make sense to me that all genetic networks are scale free. Sure, I see that edges could be limited by expression levels, but there must be genes that are highly overexpressed once they've been turned on, thus having the ability to generate many connections without penalty. Or are there basally transcribed genes that just sort of hang about to be used as needed and thus can interact with any number of other genes? I guess I should go read the actual paper now...

adaptivecomplexity's picture
One of the ideas regarding why has to do with robustness (as Josh hinted at) - networks with a few highly connected hubs, and many sparsely connected nodes, are less vulnerable to mutation - unless you take out a hub, things don't change much.  That's theoretically true of networks, but whether that actually is the case in biological networks has to be tested.
highly overexpressed once they've been turned on, thus having the ability to generate many connections without penalty.

If the connections don't matter (because the regulated gene is already highly expressed anyway), then they won't be preserved by selection, and will thus gradually be eroded by mutation.

Add a comment

The content of this field is kept private and will not be shown publicly.
  • Allowed HTML tags: <sup> <sub> <a> <em> <strong> <center> <cite> <code> <TH><ul> <ol> <li> <dl> <dt> <dd> <img> <br> <p> <blockquote> <strike> <object> <param> <embed> <del> <pre> <b> <i> <table> <tbody> <div> <tr> <td> <h1> <h2> <h3> <h4> <h5> <h6> <hr> <iframe>
  • Lines and paragraphs break automatically.
  • Web page addresses and e-mail addresses turn into links automatically.
CAPTCHA
If you register, you will never be bothered to prove you are human again. And you get a real editor toolbar to use instead of this HTML thing that wards off spam bots.