Accept-Language comparing standard currently differs between `getLanguagePriority` and `compareSpecs` #54

tangye1234 · 2019-03-13T07:49:07Z

Let's make the request

Accept-Language: zh, zh-CN;q=0.9

and
We provided backlist: ['zh-CN', 'zh-TW']

Coding:

const n = new require('negotiator')({ headers: { 'accept-language': 'zh, zh-CN;q=0.9' } })
n.language(['zh-CN', 'zh-TW'])

Actual:
'zh-TW'

Expect:
'zh-CN' for sure

Analysis:
In the code, we get two comparing code snippets:

function getLanguagePriority(language, accepted, index) {
    var priority = {
        o: -1,
        q: 0,
        s: 0
    };

    for (var i = 0; i < accepted.length; i++) {
        var spec = specify(language, accepted[i], index);

        // this means `s`(the matching level) > `q`(the quality) > `o`(the index of accepted)
        if (spec && (priority.s - spec.s || priority.q - spec.q || priority.o - spec.o) < 0) {
            priority = spec;
        }
    }

    return priority;
}

But when codes come here

function compareSpecs(a, b) {
    // `q`(the quality) > `s`(the matching level) > `o`(the accepted index) > `i`(comparing index)
    return (b.q - a.q) || (b.s - a.s) || (a.o - b.o) || (a.i - b.i) || 0;
}

WHICH LEADS:
when comparing zh-CN to accepted zh, zh-CN;q=0.9,
we get a high matching level zh-CN with a lower quality weight 0.9

and then comparing zh-TW to accepted zh, zh-CN;q=0.9,
we get the the only half-matched zh with a high quality weight 1

finally, we use compareSpecs to select a higher quality language zh-TW rather than the lower but 100% percent matched one zh-CN

Read the rfc4647, it is not defined whether this expectation should be reasonable.
It only reads:

If the user cannot be sure which scheme is being used (or
if more than one might be applied to a given request), the user
SHOULD specify the most specific (largest number of subtags) range
first and then supply shorter prefixes later in the list to ensure
that filtering returns a complete set of tags.

So the accepted language with zh, zh-CN;q=0.9 is not comformed to this user decision.
But I think, the comparing logic should be the same ( q > s > o), such that

n.languages(['zh-CN', 'zh-HK']) // returns ['zh-CN', 'zh-HK']
n.languages(['zh-HK', 'zh-CN']) // returns ['zh-HK', 'zh-CN']

The text was updated successfully, but these errors were encountered:

julianlam · 2019-03-15T16:34:24Z

How coincidental, I just ran into this issue as well.

var Negotiator = require('negotiator');
var req = { headers: { 'accept-language': 'en-CA,en;q=0.9,en-GB;q=0.8,en-US;q=0.7,fr;q=0.6,pt;q=0.5,th;q=0.4' } }
var negotiator = new Negotiator(req);
negotiator.languages(['en-GB', 'en-US', 'en-x-pirate']);
[ 'en-x-pirate', 'en-GB', 'en-US' ]   // output

In my scenario the en should match en-GB first at priority of 0.8. en substring match with en-x-pirate should be considered only after all exact matches have been exhausted.

julianlam · 2019-03-15T16:40:59Z

@tangye1234 #46 provides a bit more context, and @dougwilson has shown that this module adheres to RFC 4647, so this unfortunately is a wontfix

tangye1234 · 2019-03-19T05:03:39Z

@julianlam, I think @dougwilson is trying to explain the case Apache mod_negotiation and jshttp/negotiator behave the same, but that is not what rfc4647 describes, the RFC only describes that in this case language cannot be selected out, but only be filtered with a fallback list, while users would decide for their own.

Maybe he is right because the negotiator cannot make decisions for the user, unless user send another accept-language list: en-GB, en;q=0.9 that follows the the suggestion from RFC:

the user SHOULD specify the most specific (largest number of subtags) range
first and then supply shorter prefixes later in the list

dougwilson · 2019-03-19T05:17:02Z

Hi everyone, sorry I have been away with other things.

For Accept-Language header, this module specifically only implements the algo linked to from the HTTP RFC (https://tools.ietf.org/html/rfc7231#section-5.3.5). To quote from that section of RFC 7231:

For matching, Section 3 of [RFC4647] defines several matching
schemes. Implementations can offer the most appropriate matching
scheme for their requirements. The "Basic Filtering" scheme
([RFC4647], Section 3.3.1) is identical to the matching scheme that
was previously defined for HTTP in Section 14.4 of [RFC2616].

The "Basic Filtering" algo is what this module currently implements for Accept-Language header, as it is the algo web servers have always traditionally implemented for selecting a language from a set list for HTTP applications. The full test of this algo is in RFC 4647 here: https://tools.ietf.org/html/rfc4647#section-3.3.1

How the basic filter algo works is also illustrated in a previous comment I made here #46 (comment) which applies to the original post as well.

I'm not really familiar with the ago in RFC 4647 that would produce the expected results in the original post. @tangye1234 would you be able to link directly to the section in RFC 4647 that defines the algo you're expected and than use your example to walk though it? I'm 100% open to adding additional defined algos to the list such that you can choose which one to use when matching language tags 👍 the PR above was changing the existing standards-defined algo which is why it was marked as wontfix, though adding additional algos from the RFC as options to alter behavior is perfectly acceptable.

I hope this makes sense.

julianlam · 2019-04-05T18:19:46Z

Assuming we want to define additional algos to the call to negotiator, how would this be achieved, as a second argument for .languages?

julianlam mentioned this issue Mar 15, 2019

Seemingly normal Accept-Language header resolves to Pirate English NodeBB/NodeBB#7458

Closed

dougwilson added the awaiting more info label Mar 19, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Accept-Language comparing standard currently differs between `getLanguagePriority` and `compareSpecs` #54

Accept-Language comparing standard currently differs between `getLanguagePriority` and `compareSpecs` #54

tangye1234 commented Mar 13, 2019 •

edited

julianlam commented Mar 15, 2019 •

edited

julianlam commented Mar 15, 2019

tangye1234 commented Mar 19, 2019

dougwilson commented Mar 19, 2019

julianlam commented Apr 5, 2019

Accept-Language comparing standard currently differs between getLanguagePriority and compareSpecs #54

Accept-Language comparing standard currently differs between getLanguagePriority and compareSpecs #54

Comments

tangye1234 commented Mar 13, 2019 • edited

julianlam commented Mar 15, 2019 • edited

julianlam commented Mar 15, 2019

tangye1234 commented Mar 19, 2019

dougwilson commented Mar 19, 2019

julianlam commented Apr 5, 2019

Accept-Language comparing standard currently differs between `getLanguagePriority` and `compareSpecs` #54

Accept-Language comparing standard currently differs between `getLanguagePriority` and `compareSpecs` #54

tangye1234 commented Mar 13, 2019 •

edited

julianlam commented Mar 15, 2019 •

edited