Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Accept-Language comparing standard currently differs between getLanguagePriority and compareSpecs #54

Open
tangye1234 opened this issue Mar 13, 2019 · 5 comments

Comments

@tangye1234
Copy link

tangye1234 commented Mar 13, 2019

Let's make the request

Accept-Language: zh, zh-CN;q=0.9

and
We provided backlist: ['zh-CN', 'zh-TW']

Coding:

const n = new require('negotiator')({ headers: { 'accept-language': 'zh, zh-CN;q=0.9' } })
n.language(['zh-CN', 'zh-TW'])

Actual:
'zh-TW'

Expect:
'zh-CN' for sure

Analysis:
In the code, we get two comparing code snippets:

function getLanguagePriority(language, accepted, index) {
    var priority = {
        o: -1,
        q: 0,
        s: 0
    };

    for (var i = 0; i < accepted.length; i++) {
        var spec = specify(language, accepted[i], index);

        // this means `s`(the matching level) > `q`(the quality) > `o`(the index of accepted)
        if (spec && (priority.s - spec.s || priority.q - spec.q || priority.o - spec.o) < 0) {
            priority = spec;
        }
    }

    return priority;
}

But when codes come here

function compareSpecs(a, b) {
    // `q`(the quality) > `s`(the matching level) > `o`(the accepted index) > `i`(comparing index)
    return (b.q - a.q) || (b.s - a.s) || (a.o - b.o) || (a.i - b.i) || 0;
}

WHICH LEADS:
when comparing zh-CN to accepted zh, zh-CN;q=0.9,
we get a high matching level zh-CN with a lower quality weight 0.9

and then comparing zh-TW to accepted zh, zh-CN;q=0.9,
we get the the only half-matched zh with a high quality weight 1

finally, we use compareSpecs to select a higher quality language zh-TW rather than the lower but 100% percent matched one zh-CN

Read the rfc4647, it is not defined whether this expectation should be reasonable.
It only reads:

If the user cannot be sure which scheme is being used (or
if more than one might be applied to a given request), the user
SHOULD specify the most specific (largest number of subtags) range
first and then supply shorter prefixes later in the list to ensure
that filtering returns a complete set of tags.

So the accepted language with zh, zh-CN;q=0.9 is not comformed to this user decision.
But I think, the comparing logic should be the same ( q > s > o), such that

n.languages(['zh-CN', 'zh-HK']) // returns ['zh-CN', 'zh-HK']
n.languages(['zh-HK', 'zh-CN']) // returns ['zh-HK', 'zh-CN']
@julianlam
Copy link

julianlam commented Mar 15, 2019

How coincidental, I just ran into this issue as well.

var Negotiator = require('negotiator');
var req = { headers: { 'accept-language': 'en-CA,en;q=0.9,en-GB;q=0.8,en-US;q=0.7,fr;q=0.6,pt;q=0.5,th;q=0.4' } }
var negotiator = new Negotiator(req);
negotiator.languages(['en-GB', 'en-US', 'en-x-pirate']);
[ 'en-x-pirate', 'en-GB', 'en-US' ]   // output

In my scenario the en should match en-GB first at priority of 0.8. en substring match with en-x-pirate should be considered only after all exact matches have been exhausted.

@julianlam
Copy link

@tangye1234 #46 provides a bit more context, and @dougwilson has shown that this module adheres to RFC 4647, so this unfortunately is a wontfix

@tangye1234
Copy link
Author

@julianlam, I think @dougwilson is trying to explain the case Apache mod_negotiation and jshttp/negotiator behave the same, but that is not what rfc4647 describes, the RFC only describes that in this case language cannot be selected out, but only be filtered with a fallback list, while users would decide for their own.

Maybe he is right because the negotiator cannot make decisions for the user, unless user send another accept-language list: en-GB, en;q=0.9 that follows the the suggestion from RFC:

the user SHOULD specify the most specific (largest number of subtags) range
first and then supply shorter prefixes later in the list

@dougwilson
Copy link
Contributor

Hi everyone, sorry I have been away with other things.

For Accept-Language header, this module specifically only implements the algo linked to from the HTTP RFC (https://tools.ietf.org/html/rfc7231#section-5.3.5). To quote from that section of RFC 7231:

For matching, Section 3 of [RFC4647] defines several matching
schemes. Implementations can offer the most appropriate matching
scheme for their requirements. The "Basic Filtering" scheme
([RFC4647], Section 3.3.1) is identical to the matching scheme that
was previously defined for HTTP in Section 14.4 of [RFC2616].

The "Basic Filtering" algo is what this module currently implements for Accept-Language header, as it is the algo web servers have always traditionally implemented for selecting a language from a set list for HTTP applications. The full test of this algo is in RFC 4647 here: https://tools.ietf.org/html/rfc4647#section-3.3.1

How the basic filter algo works is also illustrated in a previous comment I made here #46 (comment) which applies to the original post as well.

I'm not really familiar with the ago in RFC 4647 that would produce the expected results in the original post. @tangye1234 would you be able to link directly to the section in RFC 4647 that defines the algo you're expected and than use your example to walk though it? I'm 100% open to adding additional defined algos to the list such that you can choose which one to use when matching language tags 👍 the PR above was changing the existing standards-defined algo which is why it was marked as wontfix, though adding additional algos from the RFC as options to alter behavior is perfectly acceptable.

I hope this makes sense.

@julianlam
Copy link

Assuming we want to define additional algos to the call to negotiator, how would this be achieved, as a second argument for .languages?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants