Online Social Networks and Media

Publish in

Documents

5 views

Please download to get full document.

View again

of 63
All materials on our website are shared by users. If you have any questions about copyright issues, please report us to resolve them. We are always happy to assist you.
Share
Description
Online Social Networks and Media . Homophilly Networks with Positive and Negative ties. Chapter 4, from D. Easley and J. Kleinberg book. HOMOPHILLY. Introduction. Surrounding context : factors other than node and edges that affect how the network structure evolves.
Transcript
Online Social Networks and Media HomophillyNetworks with Positive and Negative tiesChapter 4, from D. Easley and J. Kleinberg bookHOMOPHILLYIntroductionSurrounding context: factors other than node and edges that affect how the network structure evolvesHomophily: people tend to be similar to their friendsΑριστοτέληςlove those who are like themselvesΠλάτωναΌμοιος ομοίω αεί πελάζει (similarity begets friendship)Birds of a feather flock togetherFactors intrinsic to the network (introduced by a common friend) and contextual factors (eg attend the same school) HomophilyMiddle – High SchoolRaceMeasuring HomophilyIf the fraction of cross-gender edges is significantly less than expected, then there is evidence for homophilygender male with probability pgender female with probability qProbability of cross-gender edge? Measuring Homophily
  • “significantly” less than
  • Inverse homophily
  • Characteristics with more than two values:
  • Number of heterogeneous edges (edge between two nodes that are different)
  • Mechanisms Underlying Homophily: Selection and Social InfluenceSelection: tendency of people to form friendships with others who are like thenSocialization or Social Influence: the existing social connections in a network are influencing the individual characteristics of the individualsSocial Influence as the inverse of SelectionMutable & immutable characteristicsThe Interplay of Selection and Social InfluenceLongitudinal studiesin which the social connections and the behaviors within a group are tracked over a period of time
  • Why?
  • Study teenagers, scholastic achievements/drug use (peer pressure and selection)
  • Relative impact?
  • - Effect of possible interventions (example, drug use)
  • The Interplay of Selection and Social InfluenceChristakis and Fowler on obesity, 12,000 people over a period of 32-yearsPeople more similar on obesity status to the network neighbors than if assigned randomlyWhy?(i) Because of selection effects, choose friends of similar obesity status,(ii) Because of confounding effects of homophily according to other characteristics that correlate with obesity(iii) Because changes in the obesity status of person’s friends was exerting an influence that affected her(iii) As well -> “contagion” in a social senseTracking Link Formation in Online Data: interplay between selection and social influence
  • Underlying social network
  • Measure for behavioral similarity
  • WikipediaNode: Wikipedia editor who maintains a user account and user talk pageLink: if they have communicated with one writing on the user talk page of the otherEditor’s behavior: set of articles she has edited Neighborhood overlap in the bipartite affiliation network of editors and articles consisting only of edges between editors and the articles they have editedFACT: Wikipedia editors who have communicated are significantly more similar in their behavior than pairs of Wikipedia editors who have not (homomphily), why?Selection (editors form connections with those have edited the same articles) vs Social Influence (editors are led to the articles of people they talk to)Tracking Link Formation in Online Data: interplay between selection and social influence
  • Actions in Wikipedia are time-stamped
  • For each pair of editors A and B who have ever communicated,
  • Record their similarity over time
  • Time 0 when they first communicated -- Time moves in discrete units, advancing by one “tick” whenever either A or B performs an action on Wikipedia
  • Plot one curve for each pair of editors
  • Average, single plot: average level of similarity relative to the time of first interaction
  • Similarity is clearly increasing both before and after the moment of first interaction(both selection and social influence)Not symmetric around time 0 (particular role on similarity): Significant increase before they meetBlue line shows similarity of a random pair (non-interacting)AffiliationA larger network that contains both people and context as nodesfociAffiliation networkBipartite graphA node for each person and a node for each focusAn edge between a person A and focus X, if A participates in XAffiliationExample:Board of directors
  • Companies implicitly links by having the same person sit on both their boards
  • People implicitly linked by serving together on a aboard
  • Other contexts, president of two major universities and a former Vice-President
  • Co-evolution of Social and Affiliation NetworksSocial Affiliation NetworkTwo type of edges:Friendship: between two peopleParticipation: between a person and a focus
  • Co-evolution reflect the interplay of selection and social influence: if two people in a shared focus opportunity to become friends, if friends, influence each other foci.
  • Co-evaluation of Social and Affiliation Networks: Closure processTriadic closure: (two people with a friend in common - A introduces B to C)Membership closure: (a person joining a focus that a friend is already involved in - A introduces focus C to B) (social influence)Focal closure: (two people with a focus in common - focus A introduces B to C) (selection)Co-evaluation of Social and Affiliation NetworksTracking Link Formation in Online Data: triadic closure
  • Triadic closure:
  • How much more likely is a link to form between two people if they have a friend in common
  • How much more likely is a link to form between two people if they have multiple friends in common?
  • Tracking Link Formation in Online Data: triadic closureTake two snapshots of the network at different timesFor each k, identify all pairs of nodes that have exactly k friends in common in the first snapshot, but who are not directly connectedDefine T(k) to be the fraction of these pairs that have formed an edge by the time of the second snapshotPlot T(k)as a function of kT(0): rate at which link formation happens when it does not close any triangleT(k): the rate at which link formation happens when it does close a triangle (k common neighbors, triangles)Tracking Link Formation in Online Data: triadic closureE-mail (“who-talks-to-whom” dataset type)Among 22,000 undergrad and grad students (large US university)For 1-year
  • Network evolving over time
  • At each instance (snapshot), two people join, if they have exchanged e-mail in each direction at some point in the past 60 days
  • Multiple pairs of snapshots ->
  • Built a curve for T(k) on each pair, then average all the curves
  • Snapshots – one day apart (average probability that two people form a link per day)
  • From 0 to 1 to 2 friendsFrom 8 to 9 to 10 friend (but on a much smaller population)
  • Almost linear
  • Having two common friends produces significantly more than twice the effect compared to a single common friend Tracking Link Formation in Online Data: triadic closureBaseline model:Assume triadic closure:Each common friend two people have gives them an independent probability p of forming a link each dayFor two people with k friend in common, Probability not forming a link on any given day (1-p)kProbability forming a link on any given dayTbaseline(k) = 1 - (1-p)kGiven the small absolute effect of the first common friend in the dataTbaseline(k) = 1 - (1-p)k-1Qualitative similar (linear), but independent assumption too simpleTracking Link Formation in Online Data: focal and membership closureFocal closure: what is the probability that two people form a link as a function of the number of foci that are jointly affiliated withMembership closure: what is the probability that a person becomes involved with a particular focus as a function of the number of friends who are already involved in it?Tracking Link Formation in Online Data: focal closureE-mail (“who-talks-to-whom” dataset type)Use the class schedule of each studentFocus: class (common focus – a class together)A single shared class same effect as a single shared friend, then differentSubsequent shared classes after the first produce a diminishing returns effectTracking Link Formation in Online Data: membership closureNode: Wikipedia editor who maintains a user account and user talk pageLink: if they have communicated by one user writing on the user talk page of the otherFocus: Wikipedia articleAssociation to focus: edited the articleAgain, an initial increasing effect: the probability of editing a Wikipedia article is more than twice as large when you have two connections into the focus than one
  • Also, multiple effects can operate simultaneously
  • A Spatial Model of Segregation
  • Formation of ethnically and racially homogeneous neighbors in cities
  • A Spatial Model of Segregation: The Schelling Model
  • Simple model at a local level
  • Population of individuals called agents, of type X or type O
  • Agents reside in cells of a grid
  • Neighbor cells that touch it (including diagonal)
  • Possible to show as a graph, but use geometric grid
  • A Spatial Model of Segregation: The Schelling ModelThreshold t: Each agent wants to have at least t agents of its own type as neighborsIf an agent discovers that fewer than t of its neighbors are of the same type of itself, then it has an interest to move to a new cellUnsatisfied (shown with *)t = 3A Spatial Model of Segregation: The Schelling ModelAgents move in sequence of rounds: In each round, consider unsatisfied agents at some order, move to an unoccupied cell where it will be satisfiedHow to move? (in a random order, downwards?) Where to move? what if no satisfying position?t = 3. one row at a time working downwards, agent moves to the nearest cell that will make it satisfiedA Spatial Model of Segregation: The Schelling Model
  • Simulation, unsatisfied agents move to a random location
  • ~50 rounds, all satisfied
  • Different random starts
  • Large homogeneous regions
  • A Spatial Model of Segregation: The Schelling ModelSpatial segregation is taking place even if no individual agent is actively seeking it For t = 3, satisfied even in the minority among its neighborsRequirements not globally incompatibleIf we start from a random configuration, attach to clusters, grow, some fall below, move, “unraveling” of more integrated regions A Spatial Model of Segregation: The Schelling ModelEnd of Chapter 4
  • Homophily (selection vs social influence)
  • Graphs with more than one type of nodes (bipartite)
  • Affiliation networks
  • Spatial model of segregation
  • Chapter 5, from D. Easley and J. Kleinberg bookNetwork with Positive and Negative TiesStructural BalanceWhat about negative edges?Initially, a complete graph (or clique): every edge either + or -
  • Let us first look at individual triangles
  • Lets look at 3 people => 4 Cases
  • See if all are equally possible (local property)
  • Structural BalanceCase (a): 3 +Case (b): 2 +, 1 -A is friend with B and C, but B and C do not get well togetherMutual friendsCase (c): 1 +, 2 -Case (d): 3 -Mutual enemiesA and B are friends with a mutual enemyStructural BalanceCase (a): 3 +Case (b): 2 +, 1 -Stable or balancedUnstableA is friend with B and C, but B and C do not get well togetherImplicit force to make B and C friends (- => +) or turn one of the + to -Mutual friendsCase (c): 1 +, 2 -Case (d): 3 -Stable or balancedUnstableMutual enemiesForces to team up against the third (turn 1 – to +)A and B are friends with a mutual enemyStructural Balance A labeled complete graph is balanced if every one of its triangles is balancedStructural Balance Property: For every set of three nodes, if we consider the three edges connecting them, either all three of these are labeled +, or else exactly one of them is labeled – (odd number of +)What does a balanced network look like?The Structure of Balanced NetworksBalance Theorem: If a labeled complete graph is balanced, all pairs of nodes are friends, orthe nodes can be divided into two groups X and Y, such that every pair of nodes in X like each other, every pair of nodes in Y like each other, and every one in X is the enemy of every one in Y.Proof ...From a local to a global propertyApplications of Structural Balance
  • How a network evolves over time
  • Political science: International relationships (I)
  • The conflict of Bangladesh’s separation from Pakistan in 1972 (1)USSRN. Vietnam-PakistanUSA---+BangladeshChina-IndiaUSA support to Pakistan?Applications of Structural Balance
  • International relationships (I)
  • The conflict of Bangladesh’s separation from Pakistan in 1972 (II)+USSRN. Vietnam-PakistanUSA--+-BangladeshChina-IndiaChina?Applications of Structural Balance
  • International relationships (II)
  • A Weaker Form of Structural BalanceAllow thisWeak Structural Balance Property: There is no set of three nodes such that the edges among them consist of exactly two positive edges and one negative edgeA Weaker Form of Structural BalanceWeakly Balance Theorem: If a labeled complete graph is weakly balanced, its nodes can be divided into groups in such a way that every two nodes belonging to the same group are friends, and every two nodes belonging to different groups are enemies.Proof …A Weaker Form of Structural BalanceTrust, distrust and online ratingsEvaluation of products and trust/distrust of other usersDirected GraphsAACC-+-+BBA trusts B, B trusts C, A ? CA distrusts B, B distrusts C, A ? CIf distrust enemy relation, +A distrusts means that A is better than B, -Depends on the applicationRating political books orConsumer rating electronics productsGeneralizing Non-complete graphs Instead of all triangles, “most” triangles, approximately divide the graphWe shall use the original (“non-weak” definition of structural balance)Structural Balance in Arbitrary Graphs
  • Thee possible relations
  • Positive edge
  • Negative edge
  • Absence of an edge
  • Balance Definition for General GraphsBased on triangles (local view)Division of the network (global view)A (non-complete) graph is balanced if it can be completed by adding edges to form a signed complete graph that is balanced-+Balance Definition for General Graphs+Balance Definition for General GraphsBased on triangles (local view)Division of the network (global view)A (non-complete) graph is balanced if it possible to divide the nodes into two sets X and Y, such that any edge with both ends inside X or both ends inside Y is positive and any edge with one end in X and one end in Y is negativeThe two definition are equivalent:An arbitrary signed graph is balanced under the first definition, if and only if, it is balanced under the second definitionsBalance Definition for General GraphsAlgorithm for dividing the nodes?Balance CharacterizationWhat prevents a network from being balanced?
  • Start from a node and place nodes in X or Y
  • Every time we cross a negative edge, change the set
  • Cycle with odd number of negative edgesBalance Definition for General GraphsCycle with odd number of - => unbalancedIs there such a cycle with an odd number of -?Balance CharacterizationClaim: A signed graph is balanced, if and only if, it contains no cycles with an odd number of negative edges(proof by construction)Find a balanced division: partition into sets X and Y, all edges inside X and Y positive, crossing edges negative Either succeeds or Stops with a cycle containing an odd number of -Two steps:Convert the graph into a reduced one with only negative edgesSolve the problem in the reduced graphBalance Characterization: Step 11. Find connected components (supernodes) by considering only positive edges2. Check: Do supernodescontain a negative edge between any pair of their nodes (a) Yes -> odd cycle (1)(b) No -> each supernode either X or YBalance Characterization: Step 13. Reduced problem: a node for each supernode, an edge between two supernodes if an edge in the originalBalance Characterization: Step 2
  • Note: Only negative edges among supernodes
  • Start labeling by either X and Y
  • If successful, then label the nodes of the supernode correspondingly
  • A cycle with an odd number, corresponds to a (possibly larger) odd cycle in the original
  • Balance Characterization: Step 2Determining whether the graph is bipartite(there is no edge between nodes in X or Y, the only edges are from nodes in X to nodes in Y)
  • Use Breadth-First-Search (BFS)
  • Start the search at any node and give alternating labels to the vertices visited during the search. That is, give label X to the starting node, Y to all its neighbors, X to those neighbors' neighbors, and so on.
  • If at any step a node has (visited) neighbors with the same label as itself, then the graph is not bipartite (cross-level edge)
  • If the search ends without such a situation occurring, then the graph is bipartite.
  • Why is this an “odd” cycle?Balance CharacterizationGeneralizing Non-complete graphs Instead of all triangles, “most” triangles, approximately divide the graphApproximately Balance Networksa complete graph (or clique): every edge either + or -
  • Claim: If all triangles in a labeled complete graph are balanced, than either
  • all pairs of nodes are friends or,
  • the nodes can be divided into two groups X and Y, such that
  • every pair of nodes in X like each other,
  • every pair of nodes in Y like each other, and
  • every one in X is the enemy of every one in Y.
  • Not all, but most, triangles are balanced
  • Claim: If at least 99.9% of all triangles in a labeled compete graph are balanced,then either,
  • There is a set consisting of at least 90%of the nodes in which at least 90% of all pairs are friends, or,
  • the nodes can be divided into two groups X and Y, such that
  • at least 90% of the pairs in X like each other,
  • at least 90% of the pairs in Y like each other, and
  • at least 90% of the pairs with one end in X and one in Y are enemies
  • Approximately Balance Networks
  • Claim: If at least 99.9% of all triangles in a labeled complete graph are balanced,then either,
  • There is a set consisting of at least 90% of the nodes in which at least 90% of all pairs are friends, or,
  • the nodes can be divided into two groups X and Y, such that
  • at least 90% of the pairs in X like each other,
  • at least 90% of the pairs in Y like each other, and
  • at least 90%of the pairs with one end in X and one in Y are enemies
  • Claim: Let ε be any number, such that 0 ≤ ε < 1/8.If at least 1 – ε of all trianglesin a labeled complete graph are balanced, then either
  • There is a set consisting of at least 1-δ of the nodes in which at least 1-δof all pairs are friends, or,
  • the nodes can be divided into two groups X and Y, such that
  • at least 1-δ of the pairs in X like each other,
  • at least 1-δ of the pairs in Y like each other, and
  • at least 1-δ of the pairs with one end in X and one in Y are enemies
  • Approximately Balance NetworksBasic idea – find a “good” node A (s.t., it does not belong to too many unbalanced triangles) to partition into X and YPigeonhole principle: if n items are put into m pigeonholes with n > m, then at least one pigeonhole must contain more than one itemCounting argument based on pigeonhole: compute the average value of a set of objects and then argue that there must be at least one node that is equal to the average or below (or equal and above)End of Chapter 5Balanced networks in the case of both positive and negative edges
    Related Search
    We Need Your Support
    Thank you for visiting our website and your interest in our free products and services. We are nonprofit website to share and download documents. To the running of this website, we need your help to support us.

    Thanks to everyone for your continued support.

    No, Thanks