From: Activity-based Twitter sampling for content-based and user-centric prediction models
Notation | Description |
---|---|
N | Total number of documents |
M | Total number of users, \(1<m<M\) |
V | Global vocabulary |
w | Word in vocabulary |
\({u}_i\) | ith user out of m |
\({s}_u\) | Sentiment score of user \(u\) |
q | Aggregation window |
\({y}_i\) | Crime rate at time t(i) |
\({\Delta }r\) | Lag between a document and a target trend |
\({p}_i\) | A post tweeted at time t(i) |
\({d}_i\) | A document sampled at time t(i) |
\({X}^{\langle c \rangle }\) | Sparse document-term matrix of size \(N \times |V|\) |
\({X}^{\langle u \rangle }\) | Sparse document-sentiment matrix of size \(N \times M\) |
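
To make the shapes of the two matrices in the table concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that each entry of \({X}^{\langle c \rangle }\) is a raw term count and that each entry of \({X}^{\langle u \rangle }\) is the sentiment score \({s}_u\) of user \(u\) within a document; the table does not state either choice. The function names, the whitespace tokenizer, and the toy data are all hypothetical.

```python
from collections import Counter


def build_vocabulary(documents):
    """Map each distinct word w to a column index (the global vocabulary V)."""
    words = sorted({w for doc in documents for w in doc.split()})
    return {w: j for j, w in enumerate(words)}


def document_term_matrix(documents, vocab):
    """Sparse N x |V| matrix stored as {(row, col): count}; zeros are omitted.

    Assumption: entries are raw term counts.
    """
    entries = {}
    for i, doc in enumerate(documents):
        for w, count in Counter(doc.split()).items():
            entries[(i, vocab[w])] = count
    return entries, (len(documents), len(vocab))


def document_sentiment_matrix(doc_user_scores, users):
    """Sparse N x M matrix stored as {(row, col): s_u}; zeros are omitted.

    Assumption: doc_user_scores[i] maps each user that contributes to
    document d_i to that user's sentiment score (hypothetical input format).
    """
    col = {u: j for j, u in enumerate(users)}
    entries = {}
    for i, scores in enumerate(doc_user_scores):
        for u, s in scores.items():
            if s != 0:
                entries[(i, col[u])] = s
    return entries, (len(doc_user_scores), len(users))


# Toy data: N = 2 documents, M = 3 users.
documents = ["theft reported downtown", "downtown traffic downtown"]
users = ["u1", "u2", "u3"]
doc_user_scores = [{"u1": 0.5}, {"u2": -0.25, "u3": 0.75}]

vocab = build_vocabulary(documents)
X_c, shape_c = document_term_matrix(documents, vocab)
X_u, shape_u = document_sentiment_matrix(doc_user_scores, users)
```

A dictionary keyed by (row, column) is used here only to keep the sketch free of dependencies; a sparse matrix library would be the more practical choice at Twitter scale.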