Add field comparable to firstHtml to the har.request tables #21

Open
rviscomi opened this issue Aug 28, 2017 · 2 comments

@rviscomi

The runs.request tables include a firstHtml field to indicate that the request is for the parent document.

Queries on the har.request tables must join on the corresponding runs table to get this info. There are tens of millions of requests in each table, so the join is expensive.

To simplify queries and make them less expensive, add a boolean field comparable to firstHtml to the har.request tables. It should use the same logic as the runs table: the first 200 response with an HTML MIME type.
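For illustration, the "first 200 response with an HTML MIME type" rule could be sketched roughly as below. This is not the runs table's actual implementation. It assumes entries follow the HAR 1.2 layout (`response.status`, `response.content.mimeType`), and the names `HTML_MIME_TYPES`, `is_html_200`, and `first_html_index` are hypothetical. Which MIME types count as HTML is also an assumption.

```python
from typing import Optional

# Assumption: which MIME types count as "HTML" is not specified in this issue.
HTML_MIME_TYPES = {"text/html", "application/xhtml+xml"}


def is_html_200(entry: dict) -> bool:
    """Return True if a HAR entry is a 200 response with an HTML MIME type."""
    response = entry.get("response") or {}
    content = response.get("content") or {}
    # Drop parameters such as "; charset=utf-8" before comparing.
    mime = (content.get("mimeType") or "").split(";", 1)[0].strip().lower()
    return response.get("status") == 200 and mime in HTML_MIME_TYPES


def first_html_index(entries: list) -> Optional[int]:
    """Return the index of the first matching entry, or None if there is none."""
    for i, entry in enumerate(entries):
        if is_html_200(entry):
            return i
    return None
```

With this rule, an initial redirect (for example a 301 served as text/html) is skipped, and the first 200 HTML response after it is the one that gets the flag.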

@igrigorik

Would this be a step in the Dataflow pipeline?

@rviscomi

Yes, it should be annotated during the iteration over the requests in the HAR file.

This field would also be valuable on the har.bodies tables.
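A minimal sketch of what annotating during the iteration might look like, written in plain Python rather than as an actual Dataflow step. The function name `annotate_first_html` and the output row shape are hypothetical, the HAR layout is assumed to be HAR 1.2 (`log.entries[].response`), and the set of MIME types treated as HTML is an assumption.

```python
from typing import Iterator

# Assumption: the exact set of MIME types treated as HTML is not given here.
HTML_MIME_TYPES = {"text/html", "application/xhtml+xml"}


def annotate_first_html(har: dict) -> Iterator[dict]:
    """Yield one row per request, adding a boolean firstHtml field.

    Only the first 200 response with an HTML MIME type gets firstHtml=True;
    every other request in the same HAR gets False.
    """
    found = False  # flips to True once the parent document has been seen
    for entry in (har.get("log") or {}).get("entries") or []:
        response = entry.get("response") or {}
        content = response.get("content") or {}
        mime = (content.get("mimeType") or "").split(";", 1)[0].strip().lower()
        is_first = (
            not found
            and response.get("status") == 200
            and mime in HTML_MIME_TYPES
        )
        found = found or is_first
        # Shallow copy so the parsed HAR itself is left unmodified.
        row = dict(entry)
        row["firstHtml"] = is_first
        yield row
```

Because the flag is decided in the same pass that emits the rows, the same value could in principle be attached to the rows written to both the requests and bodies tables without a later join.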
