Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a TTL model #2475

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from
Draft

Add a TTL model #2475

wants to merge 1 commit into from

Conversation

str4d
Copy link
Contributor

@str4d str4d commented Jul 10, 2023

This unifies the logic for handling the dnscontrol-default TTL, and frees up the TTL value 0 for use by providers (e.g. Linode, which uses it as its sentinel for its default TTL).

Closes #2444.

@str4d
Copy link
Contributor Author

str4d commented Jul 10, 2023

With this change, I am able to fully resolve #2440 by adding an explicit DefaultTTL(0) to all of my DSP_LINODE sections. It would be slightly cleaner if providers could override the global DefaultTTL for themselves, but that would be a more complex change to the API, whereas just having models.TTL differentiate between "unset" and "set to 0" is simpler to reason about.

It will likely be necessary for providers to be checked for previously-hidden assumptions about the value 0; before when parsing this from the provider into dnscontrol it would be silently coerced to models.DefaultTTL in other places, whereas now 0 is preserved unless explicitly checked for by the provider and replaced by models.EmptyTTL().

Comment on lines -272 to +273
if rc.TTL != 0 {
ttl := int64(rc.TTL)
if rc.TTL.Value() != 0 {
ttl := int64(rc.TTL.Value())
Copy link
Contributor Author

@str4d str4d Jul 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an example of where a provider was explicitly checking for 0, in a place where it could never be reached (as dnscontrol always replaced 0 with models.DefaultTTL before reaching this point). I presumed that this check was intentional and thus left it in place (as it is now triggerable).

@@ -277,7 +277,7 @@ func toReq(dc *models.DomainConfig, rc *models.RecordConfig) (*recordEditRequest
Type: rc.Type,
Name: rc.GetLabel(),
Target: rc.GetTargetField(),
TTL: int(rc.TTL),
TTL: int(rc.TTL.Value()),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what fully fixes #2440:

  • When no TTL is set, this evaluates to models.DefaultTTL (300).
  • When DefaultTTL(0) is used explicitly, this evaluates to 0.

@TomOnTime
Copy link
Collaborator

Hi there! I'm on vacation so please expect delays.

Thank you for taking the time to work on this issue. The proposed change is pretty fundamental and will require extensive discussion and testing.

I have a number of over-arching concerns:

  1. I have a concern about code complexity. Would it be less complex to declare TTL=0 to be the default? It is not a valid TTL, thus we should be able to use it. This would eliminate a lot of the complexity especially the use of pointers.
  2. Do we want to support a default TTL at all? Every vendor has a different default TTL. Therefore depending on the default makes the zone non-portable to other providers. It also makes the use of dual-hosted domains more difficult. I could be convinced that zone portability is rare and insignificant. anyway... maybe we should simply change any "default" to (for example) 1 day; the next "push" would remove the use of default from the zone. Then all our our domain data is explicit.

Code-specific suggestions:

  1. Rather than add to dns.go, please move the TTL-related code to models/ttl.go
  2. ValueRef(): Is there a better term for this?
  3. A lot of tests are breaking, especially for BIND.
    cd integrationTest && go test -v -verbose -provider BIND

@str4d
Copy link
Contributor Author

str4d commented Jul 22, 2023

Thank you for taking the time to work on this issue. The proposed change is pretty fundamental and will require extensive discussion and testing.

Yep, as expected! I will say that I'm not claiming this PR is the optimal approach (I am not a Go or JS developer, and have fumbled my way into something that compiles and works with a specific provider), but my hope is that it helps us to coalesce around a viable approach.

I have a number of over-arching concerns:

  1. I have a concern about code complexity. Would it be less complex to declare TTL=0 to be the default? It is not a valid TTL, thus we should be able to use it. This would eliminate a lot of the complexity especially the use of pointers.

In this PR I didn't do that because I was trying to preserve dnscontrol's global default (to not affect the existing user API) while also enabling the full unsigned integer range to be used for per-provider TTL values (as I have not surveyed what values providers actually use for different settings).

If the global dnscontrol default were either removed, or set up to be the fallback if an individual provider doesn't have its own default, and we confirmed that TTL=0 isn't used as any other kind of sentinel by any provider, then yeah using TTL=0 for this would simplify the implementation.

As an aside, my first local attempt at this PR did not use pointers, instead opting for an Option-based approach (I'm a Rust developer 🦀) via a third party dependency that dnscontrol already depended on elsewhere. However, one single provider within dnscontrol required that the TTL value could itself be referenced via a pointer, which was not possible for the non-pointer Option type. I did not spend the time to figure out if that was a hard requirement for the provider, and instead switched to using a pointer inside an explicit TTL model (the latter being a better approach anyway).

  1. Do we want to support a default TTL at all? Every vendor has a different default TTL. Therefore depending on the default makes the zone non-portable to other providers. It also makes the use of dual-hosted domains more difficult. I could be convinced that zone portability is rare and insignificant. anyway... maybe we should simply change any "default" to (for example) 1 day; the next "push" would remove the use of default from the zone. Then all our our domain data is explicit.

I personally don't particularly care whether you want to support a default TTL or not. My own issue is "solved" by this PR, in that I'm wanting to move my DNS settings away from Linode, and all I wanted was to confirm that my dnsconfig.js was correct (the existing diff view makes it very hard to verify this). So I don't actually need this PR to be merged.

However, there doesn't appear to be a clear decision one way or the other regarding default TTLs, and the implicit decision to map 0 to 300 instead of having separate "unconfigured" or "default" states causes problematic interactions with individual providers (e.g. #2440). Even if there isn't support for any "default" TTL in the model, I think having a TTL model (and a way to tell when the user is using it) is useful.

Code-specific suggestions:

  1. Rather than add to dns.go, please move the TTL-related code to models/ttl.go

Done.

  1. ValueRef(): Is there a better term for this?

IDK, I can go with whatever is standard Go naming convention here. In Rust we use *_ref suffixes to denote APIs that return immutable references (as opposed to owned values).

  1. A lot of tests are breaking, especially for BIND.
    cd integrationTest && go test -v -verbose -provider BIND

Thanks, I couldn't figure out how to run the tests. It took me a bit to figure out that I actually had to set an environment variable to get this to run:

cd integrationTest && BIND_DOMAIN=example.com go test -v -verbose -provider BIND

And the test failure was because prettyzone was trying to print the TTL model directly instead of TTL.Value(), and so was writing the address of the pointer into the zone file 😅 Fixed.

Force-pushed with changes 1 and 3.

This unifies the logic for handling the dnscontrol-default TTL, and
frees up the TTL value 0 for use by providers (e.g. Linode, which uses
it as its sentinel for its default TTL).

Closes StackExchange#2444.
@tlimoncelli
Copy link
Contributor

I appreciate all the effort you've put into this. While this solution will work, I think there are other solutions that are more simple that will also work. I always favor the more simple solution.

I don't think we can solve this globally until first we decide how we want TTLs to work. Most of the TTL-handling code was written in 2016-ish when DNSControl had fewer formal design specifications (translation: we were guessing our way through most design decisions and a lot of specifics were just implemented without a formal design). As a result, each provider handles things slightly differently. Eventually we should do an RFC and get feedback from the community about how TTLs should work, get consensus, then implement a solution based on that feedback.

In the meanwhile, I think we should fix this just for Linode rather than doing a global change that affects nearly all providers.

Therefore, I'm asking you to start a different PR that fixes this for Linode only.

Thanks for understanding.

@str4d
Copy link
Contributor Author

str4d commented Sep 9, 2023

I appreciate all the effort you've put into this. While this solution will work, I think there are other solutions that are more simple that will also work. I always favor the more simple solution.

@tlimoncelli If you know of a simpler solution, I am all ears!

The core problem is the aliasing everywhere of TTL = 0; this needs to be addressed somehow. The JS model uses it to mean both "nothing configured" and "user explicitly configured 0:

function newDomain(name, registrar) {
return {
name: name,
subdomain: '',
registrar: registrar,
meta: {},
records: [],
recordsabsent: [],
dnsProviders: {},
defaultTTL: 0,

// DefaultTTL(v): Set the default TTL for the domain.
function DefaultTTL(v) {
if (_.isString(v)) {
v = stringToDuration(v);
}
return function (d) {
d.defaultTTL = v;

dnscontrol/pkg/js/helpers.js

Lines 1048 to 1051 in af91e37

var record = {
type: type,
meta: {},
ttl: d.defaultTTL,

// TTL(v): Set the TTL for a DNS record.
function TTL(v) {
if (_.isString(v)) {
v = stringToDuration(v);
}
return function (r) {
r.ttl = v;

The Go model dns.RecordConfig is parsed directly from the JS model:

type RecordConfig struct {
Type string `json:"type"` // All caps rtype name.
Name string `json:"name"` // The short name. See above.
SubDomain string `json:"subdomain,omitempty"`
NameFQDN string `json:"-"` // Must end with ".$origin". See above.
target string // If a name, must end with "."
TTL uint32 `json:"ttl,omitempty"`

dns.RecordConfig is used directly by every single provider, so any solution that alters the type of TTL (as this PR does) is by definition a global solution. Your request therefore prevents me from doing this.

Additionally, individual providers never see RecordConfig.TTL = 0, because of normalize.ValidateAndNormalizeConfig:

// Normalize Records.
models.PostProcessRecords(domain.Records)
for _, rec := range domain.Records {
if rec.TTL == 0 {
rec.TTL = models.DefaultTTL
}

Delaying or changing this normalization would require changes to every single provider to handle the fact that their assumptions about default TTL handling have been broken. So I can't alter that either if I am to satisfy your request.

As a result of the above two implementation details, there is currently no way for the Linode provider to distinguish these three cases:

  • No setting, falling back on dnscontrol's global default TTL of 300 (changing this to instead fall back on the provider's default TTL is something I presume would be part of the RFC you describe).
  • The user explicitly configuring DefaultTTL(0) (which should always be interpreted as a valid Linode TTL setting).
  • The user explicitly configuring DefaultTTL(300).

I'll be honest, I do not see a simpler solution than "change the type of TTL". I see (what I believe to be) a more complex solution that allows for an incremental migration, however:

  • Add a new TTL field to dns.RecordConfig with a different name (say, TypedTTL).
  • Connect TypedTTL to the JS model in one of two ways:
    • Add similar duplicate fields to the JS model, and handle user updates to them in parallel with the existing defaultTTL and ttl fields.
    • Change the JS model to be nullable (like this PR) and then change the Go parser to parse the ttl field into TypedTTL, and then copy the parsed value into the existing TTL field (aliasing None and Some(0) to 0).
  • Change every single location in the codebase that modifies dns.RecordConfig.TTL to also modify TypedTTL, except for normalize.ValidateAndNormalizeConfig (which would leverage TypedTTL's None case as meaning "use the dnscontrol global default").
    • Hopefully no providers are modifying TTL so I don't have to modify their code.
  • Change the Linode provider to use TypedTTL instead of TTL.
    • All other providers are left using TTL as-is.

If this approach is along the lines of what you were thinking, then I'll go ahead and implement it. And if there is some aspect of dnscontrol that I'm completely overlooking (as a non-Go developer or as a new contributor) that would make this way simpler, I'd love to know.

@str4d str4d marked this pull request as draft September 14, 2023 18:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DNSControl treats a configured TTL of 0 as "not configured"
3 participants