Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pair is at least 80% slower than using an Array 2-tuplet #50

Open
dotnetCarpenter opened this issue Apr 28, 2022 · 9 comments
Open

Pair is at least 80% slower than using an Array 2-tuplet #50

dotnetCarpenter opened this issue Apr 28, 2022 · 9 comments

Comments

@dotnetCarpenter
Copy link

dotnetCarpenter commented Apr 28, 2022

I had an issue with a super simple parser that took seconds to parse simple objects. After profiling, I found the bottleneck to be Pair. Changing from Pair to an Array Tuple made my use-case go from seconds to 40ms!

I made a reduced test case here: https://dotnetcarpenter.github.io/common-scandi-names/performance.html

All tests and profiles I have made are with type checking disabled.

Using S.Pair is at least 80% slower than using an Array 2-tuplet, in this simple parsing test using S.reduce. Where tokens are https://gist.github.com/dotnetCarpenter/a52ff00f8d9a1cf1fcff23114a0fabea#file-norwegian-tokens-js.

const TOKEN = Object.freeze ({
  PARENT: 8,
  CHILD: 16,
})

// Token = { value: Any, type: Number }

//    parse :: Pair Ast (Array a) -> Token -> Pair Ast (Array a)
const parse = pair => token => {
  let ast = S.fst (pair), parent = S.snd (pair)

  switch (token.type) {
    case TOKEN.PARENT:
      parent = ast[token.value] = []
      break
    case TOKEN.CHILD:
      parent.push (token.value)
      break
    default:
      // skip token
  }

  return S.Pair (ast) (parent)
}

//    parser :: Array Token -> Ast
const parser = S.pipe ([
  timeStart ('parse'),
  S.reduce (parse) (S.Pair ({}) ([])),
  S.fst,
  timeEnd ('parse'),
])

parser (tokens)

You should be able to copy/paste the code from this gist into nodejs and see the difference (the two data sets are also in the same gist.

Profile
Even with the mangled files names due to rollup/vite bundling, it is still easy to spot where the majority of time is spent.

  function Pair(fst) {
    return snd => {
      const pair = Object.create (prototype);
      if (Z.Setoid.test (fst) && Z.Setoid.test (snd)) {
        pair['fantasy-land/equals'] = Pair$prototype$equals;
        if (Z.Ord.test (fst) && Z.Ord.test (snd)) {
          pair['fantasy-land/lte'] = Pair$prototype$lte;
        }
      }
      if (Z.Semigroup.test (fst)) {
        if (Z.Semigroup.test (snd)) {
          pair['fantasy-land/concat'] = Pair$prototype$concat;
        }
        pair['fantasy-land/ap'] = Pair$prototype$ap;
        pair['fantasy-land/chain'] = Pair$prototype$chain;
      }
      pair.fst = fst;
      pair.snd = snd;
      return pair;
    };
  }

Pair does rigorous testing regardless of type checking mode, in order to know which ADT's to implement. A fix would need to make sure that these tests will still function.

@dotnetCarpenter
Copy link
Author

I haven't checked if there are any checks that should have been disabled but is not, when type checking is disabled. But I suspect that is not the case.

There is a stage 2 proposal, JavaScript Records & Tuples Proposal, that might be leverage to gain speed while ensuring constraints. I will need to look into that.

Here is an explanation of that proposal: https://2ality.com/2020/05/records-tuples-first-look.html

@Avaq
Copy link
Member

Avaq commented Apr 28, 2022

I think this problem generalizes to all Sanctuary ADTs with conditional type class instances. That's most of them, I guess.

@dotnetCarpenter
Copy link
Author

Would it be possible to detect if check types is false and then have a fast path in that circumstance?

@Avaq
Copy link
Member

Avaq commented Apr 28, 2022

I'm guessing no, because it would alter the behaviour of the code if checkTypes is false. It's doing the deep value inspection because it needs to figure out if it can attach a method to the container. And other parts of the code check for the existence of this method to determine how to dispatch certain operations.

@dotnetCarpenter
Copy link
Author

Looks like we can loose the same type test in Z.lte since it is already done for Setoid / Z.equals and an Ord must be a Setoid, right?
If yes, then there might be more tests that can be dropped due to the type classes hierarchy.

@dotnetCarpenter
Copy link
Author

I tried to change the constructor to see if I could speed up the validation by being less thorough. There was literally no difference!

Original: parse: 6.758s
Fast path: parse: 6.462s

  const fastPath = fst => snd => {
    const pair = Object.create (prototype);

    if (typeof fst === typeof snd) {
      pair['fantasy-land/equals'] = Pair$prototype$equals;

      if ( !(fst instanceof Function) && !(snd instanceof Function) ) {
        pair['fantasy-land/lte'] = Pair$prototype$lte;
      }
    }

    if (fst.concat instanceof Function || typeof fst === 'object') {
      if (snd.concat instanceof Function || typeof snd === 'object') {
        pair['fantasy-land/concat'] = Pair$prototype$concat;
      }

      pair['fantasy-land/ap'] = Pair$prototype$ap;
      pair['fantasy-land/chain'] = Pair$prototype$chain;
    }

    pair.fst = fst;
    pair.snd = snd;

    return pair;
  }

@dotnetCarpenter
Copy link
Author

After taking another look at the profile image, I now suspect that it is concat that is being slow. Another very strange behavior I just noticed is that if I use unchecked then I get 1-2 second speedup!

S.unchecked.reduce (parse) (S.Pair ({}) ([])),

I suspect that process.env.NODE_ENV = 'production' is not working as intended.

@dotnetCarpenter
Copy link
Author

Apparently process.env.NODE_ENV = 'production' and S.unchecked is not the same in nodejs while both have the same run time performance in the browser.

In a browser: https://dotnetcarpenter.github.io/common-scandi-names/performance.html
Source: https://github.com/dotnetCarpenter/common-scandi-names/blob/a6d92bfa3787653a14ba4385a4efe2d0c6c5f9c6/vite/pair-performance-issue.js

In nodejs 18.0:

$ node dev/pair-performance-issue.js
The test run will start in 2 seconds...
parse_: 1.590s
parse2: 4.185ms
parse: 6.243s

Source: https://github.com/dotnetCarpenter/common-scandi-names/blob/a6d92bfa3787653a14ba4385a4efe2d0c6c5f9c6/dev/pair-performance-issue.js

@dotnetCarpenter
Copy link
Author

I'm might revisit this later to see if there is any checking that can be omitted, since it does not appear to be the checks in the Pair constructor. But I'm inclined to give up and close this. Now I know that in any hot loop, I will need to use js native containers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants