retention: save dates #1

Open
marg51 opened this issue Apr 30, 2015 · 0 comments
marg51 commented Apr 30, 2015

draft

Goal:

Cohort analysis -› retention rate …

  • What percentage of users with feature A enabled came back the week after they signed up?
  • After how many weeks did our mobile users make their first purchase?
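To make the first question concrete, here is a minimal Python sketch. The record layout (`feature_a`, `signup`, `visits`) is hypothetical, and "came back the next week" is assumed to mean a visit 7 to 13 days after signup.

```python
from datetime import date

def week1_retention(users):
    """Percentage of feature-A users with a visit 7..13 days after signup."""
    cohort = [u for u in users if u["feature_a"]]
    if not cohort:
        return 0.0
    returned = sum(
        1
        for u in cohort
        if any(7 <= (visit - u["signup"]).days < 14 for visit in u["visits"])
    )
    return 100.0 * returned / len(cohort)

# Hypothetical sample data, for illustration only.
USERS = [
    {"feature_a": True, "signup": date(2015, 4, 1), "visits": [date(2015, 4, 9)]},
    {"feature_a": True, "signup": date(2015, 4, 1), "visits": [date(2015, 4, 3)]},
    {"feature_a": False, "signup": date(2015, 4, 1), "visits": [date(2015, 4, 9)]},
]

rate = week1_retention(USERS)
```

Only the two feature-A users form the cohort; the third user is ignored even though they returned.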

We need access to two things: an acquisition date and an event date.

  • the acquisition date can be the sign-up, the first visit, the first visit with a specific device, …
  • the event date can be, for example, a purchase, with a date stored for each one
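The quantity of interest is the gap between those two dates. A small sketch, assuming both are available as plain calendar dates (the function name is illustrative):

```python
from datetime import date

def weeks_since_acquisition(acquisition: date, event: date) -> int:
    """Whole weeks elapsed between the acquisition date and the event date."""
    return (event - acquisition).days // 7

signup = date(2015, 4, 1)
purchase = date(2015, 4, 20)
weeks = weeks_since_acquisition(signup, purchase)  # 19 days apart
```

Swapping the acquisition date (sign-up, first visit, ...) changes the cohort definition without changing the calculation.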

We need to be able to filter events by:

  • app version
  • enabled features
  • browser
  • user location

The retention rate will be calculated this way:

  • filter the events
  • aggregate the events (per browser, app version, ...)
  • build a histogram of the difference between acquisition date and event date
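The three steps can be sketched in plain Python over in-memory records. This is only an illustration of the logic, not the Elasticsearch implementation; the field names and sample events are hypothetical. The bucket key follows the usual fixed-interval histogram rule, `floor(value / interval) * interval`.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical flat events, for illustration only.
EVENTS = [
    {"browser": "Chrome", "app_version": "1.0",
     "signup": date(2015, 4, 1), "first_buy": date(2015, 4, 3)},
    {"browser": "Chrome", "app_version": "1.0",
     "signup": date(2015, 4, 1), "first_buy": date(2015, 4, 10)},
    {"browser": "Chrome", "app_version": "1.1",
     "signup": date(2015, 4, 2), "first_buy": date(2015, 4, 20)},
    {"browser": "Firefox", "app_version": "1.0",
     "signup": date(2015, 4, 1), "first_buy": date(2015, 4, 2)},
]

def retention_histogram(events, browser, interval_days=7):
    """Filter by browser, group by app version, bucket the day difference."""
    buckets = defaultdict(Counter)
    for event in events:
        if event["browser"] != browser:  # step 1: filter
            continue
        diff = (event["first_buy"] - event["signup"]).days
        key = (diff // interval_days) * interval_days  # step 3: histogram bucket
        buckets[event["app_version"]][key] += 1  # step 2: aggregate per version
    return {version: dict(counts) for version, counts in buckets.items()}

result = retention_histogram(EVENTS, "Chrome")
```

Each inner key is the lower bound of a 7-day bucket, so `0` means a first purchase within the first week after signup.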

Using a flat document:

# keep only events for the Chrome browser
query:
    filtered:
        filter:
            term:
                browser.family: "Chrome"

# split the results by app version
aggs:
    appversion:
        terms:
            field: "appVersion"
        # and compute the number of days between signup and first purchase,
        # bucketed into 7-day intervals
        aggs:
            firstBuy:
                histogram:
                    script: "Days.daysBetween(doc['signup_date'].date, doc['first_buy_date'].date).getDays()"
                    interval: 7

A flat document has two advantages: queries are fast to execute and easy to write (a good point for Kibana).
However, it has a few drawbacks: it is slow to index, consumes more memory and is not really future proof.

The main problem is that it is not future proof. Since everything is stored in a single document, any useful data you later want to attach to a session must also be added to this document, and anything not in this document is not available at all.
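A minimal sketch of that limitation, with hypothetical records and field names: the flat document only carries the fields selected at index time, so a field added later requires rebuilding every document.

```python
# Hypothetical source records; field names are illustrative only.
user = {"userId": 42, "signup": "2015-04-01", "location": "FR"}
session = {"browser": "Chrome", "appVersion": "1.0", "first_buy": "2015-04-10"}

# Fields chosen at index time; anything not listed never reaches the flat document.
FLAT_FIELDS = ("userId", "signup", "browser", "appVersion", "first_buy")

def flatten(*sources, fields=FLAT_FIELDS):
    """Merge the source records and keep only the selected fields."""
    merged = {}
    for source in sources:
        merged.update(source)
    return {name: merged[name] for name in fields if name in merged}

doc = flatten(user, session)
# "location" exists on the user record but cannot be queried from the flat document.
location_available = "location" in doc

# Making it available means extending the field list and re-indexing every document.
doc_v2 = flatten(user, session, fields=FLAT_FIELDS + ("location",))
```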


Proposed schema for a flat document

{
    "properties" : {
      "type": {
        "type": "string",
        "index": "not_analyzed"
      },
      "name": {
        "type": "string",
        "index": "not_analyzed"
      },
      "app": {
        "type": "object",
        "properties": {
          "name": {
            "type": "string",
            "index": "not_analyzed"
          },
          "version": {
            "type": "string",
            "index": "not_analyzed"
          },
          "features": {
            "type": "object"
          }
        }
      },
      "user": {
        "type": "object",
        "properties": {
          "userId": {
            "type": "long",
            "index": "not_analyzed"
          }
        }
      },
      "device": {
        "type": "object",
        "properties": {
          "browser": {
            "type": "object",
            "properties": {
              "family": {
                "type": "string",
                "index": "not_analyzed"
              },
              ....
            }
          },
          "device": {
            "type": "object",
            "properties": {
              "family": {
                "type": "string",
                "index": "not_analyzed"
              },
              ....
            }
          },
          "os": {
            "type": "object",
            "properties": {
              "family": {
                "type": "string",
                "index": "not_analyzed"
              },
              ....
            }
          }
        }
      },
      "acquisitions": {
        "type": "object",
        "properties": {
          "first_visit": {
            "type": "date"
          },
          "first_device_visit": {
            "type": "date"
          },
          "signup": {
            "type": "date"
          },
          ...
        }
      }
    }
}