Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Display transform function for full names #805

Open
torfjelde opened this issue Oct 10, 2023 · 15 comments
Open

Display transform function for full names #805

torfjelde opened this issue Oct 10, 2023 · 15 comments
Labels
enhancement New feature or request

Comments

@torfjelde
Copy link

Is your feature request related to a problem? Please describe.
I often find myself wanting to search through my bibliographies using the first name. This might just be because I remember the first name but not the last name, or there are too many authors with the same last name.

Describe the solution you'd like
I'm thinking something like (basically just a copy-paste of citar--shorten-names):

;; Similar to `citar--shorten-name' and `citar--shorten-names'
;; but using the full name instead of just the last name.
(defun citar--full-name (name)
  "Return full NAME, i.e. `given family`, from `family, given' string.

Otherwise, return as is."
  (let (split-names (split-string name ", "))
    (if (> (length split-names) 1)
        (concat (cadr split-names) " " (car split-names))
      name)))

(defun citar--full-names (namestr &optional truncate andstr)
  "Return a list of full names from a list of full NAMESTR.

With an integer TRUNCATE, will shorten the list, and ANDSTR will
replace last comma."
  (let* ((namelist (split-string (or namestr "") " and "))
         (namelength (length namelist))
         (tnamelist (seq-take namelist (or truncate namelength)))
         (tnamelength (length tnamelist)))
    (mapconcat
     (lambda (n)
       (let* ((shortname (citar--full-name n))
              (pos (citar--shorten-name-position tnamelist n))
              (suffix
               (cond
                ;; if last name in the list and we're truncating add et al.; otherwise, no suffix
                ((equal pos tnamelength)
                 (if (< tnamelength namelength) " et al." ""))
                ;; if second to last in the list, and ANDSTR, use that
                ((and andstr (equal pos (- tnamelength 1)))
                 (concat " " andstr " "))
                ;; otherwise, use a comma
                (t ", "))))
         (concat shortname suffix)))
     tnamelist "")))

in combination with

(setq citar-display-transform-functions
   `((sn . (citar--shorten-names))
     (etal . (citar--shorten-names 3 "&"))
     (fn . (citar--full-names))))

so I can use this in templates with the transform key fn.

Describe alternatives you've considered
One alternative is to "maintain" the usage of "and" in the author list, but IMO this becomes quite cluttered when searching, and so I prefer the "given1 family1, given2 family2, ..." instead of `given1 family1 and given2 family2 and ...".

@torfjelde torfjelde added the enhancement New feature or request label Oct 10, 2023
@bdarcus
Copy link
Contributor

bdarcus commented Oct 10, 2023

Given this behavior is already configurable (just create your own function and use that in your templates), is there really any point in changing the default function there?

Or maybe I don't understand what you're asking for.

@torfjelde
Copy link
Author

Of course, that is also completely reasonable:)

But given that the sn and the etal transform functions are already there, it seems like citar.el thinks it is useful to provide some "default" transform functions, right? And so I think it's just a question of whether the full name transformation function should be considered one of these?

Personally I think the answer is yes (because I use it myself 😅 ) but I completely understand if that opinion is not shared. Nonetheless, I thought it would be useful to open an issue so other people who desired the same could potentially discover a "solution":)

@bdarcus
Copy link
Contributor

bdarcus commented Oct 21, 2023

OK, so you're not asking to change the default, but to add another option?

I don't see any particular problem doing that. Can you submit a PR?

https://github.com/emacs-citar/citar/blob/main/CONTRIBUTING.org

@torfjelde
Copy link
Author

torfjelde commented Oct 21, 2023 via email

@bdarcus
Copy link
Contributor

bdarcus commented Oct 21, 2023

One design option worth considering (I'm not myself sure):

Perhaps citar--shorten-names should be generalized, where this becomes a parameter on that?

Whether or not to do that probably comes down to the clarity and maintainability of the code, which is why I'm not sure ATM.

torfjelde added a commit to torfjelde/citar that referenced this issue Oct 23, 2023
Add new display transform function for full names.

Generalize existing `citar--shorten-names` to `citar--render-names`
to allow code-sharing between completion rendering of family names
and full names.

Refs: emacs-citar#805
torfjelde added a commit to torfjelde/dotfiles that referenced this issue Oct 25, 2023
@aikrahguzar
Copy link
Contributor

I think a better way to solve this problem would be allow display transform functions to make some part of the string invisible. However, this would mess up the alignment since truncate-string-to-width doesn't take visibility into account. I experimented a bit and it is not too hard to write a version that takes into account invisible portions of the string when fitting it to a given width:

(defun citar-format--truncate-string-to-column (str end-column &optional ellipsis ellipsis-text-property)
  (with-current-buffer (get-buffer-create " *truncate-string*" t)
    (insert str)
    (if (<= (current-column) end-column)
        (insert (make-string (- end-column (current-column)) ?\s))
      (when (not (stringp ellipsis))
        (setq ellipsis (truncate-string-ellipsis)))
      (setq end-column (- end-column (if ellipsis (string-width ellipsis) 0)))
      (let ((col (move-to-column end-column)))
        (unless (eq col end-column)
          (forward-char -1)
          (cl-callf concat ellipsis
            (make-string (- end-column col) ?\s))))
      (if ellipsis-text-property
          (put-text-property (point) (point-max) 'display (or ellipsis ""))
        (delete-region (point) (point-max))
        (insert ellipsis)))
    (delete-and-extract-region (point-min) (point-max))))

To see how performant it is, I evaluted

(defvar citar-format--test (string-split (buffer-substring-no-properties (point-min) (point-max)) "\n" t))

so that citar-format--test has lines of citar-format.el as list elements. Then I benchmarked,

(benchmark-run-compiled 1000
    (dolist (line citar-format--test)
      (truncate-string-to-width line 25 nil ?\s t t)))
;;  (1.477754321 6 0.9875454620000141)

(benchmark-run-compiled 1000
  (dolist (line citar-format--test)
    (citar-format--truncate-string-to-column line 25 t t)))
;; (3.7252026 6 0.9839875159999565)

So citar-format--truncate-string-to-column is about 2.5x slower than truncate-string-to-width. However most of it is buffer switching overhead due to with-current-buffer,

(benchmark-run-compiled 1000
  (with-current-buffer (get-buffer-create " *truncate-string*" t)
    (dolist (line citar-format--test)
      (citar-format--truncate-string-to-column line 25 t t))))
;; (1.647513596 6 0.9943548589999978)

i.e. wrapping each dolist to make the temporary buffer current makes the overhead go down to ~ 15%. This means that if wrap the formatting of all the candidates with this call, the overhead is not too high. However it does vary with the desired width,

| Width |   citar-format--truncate-string-to-column   |  truncate-string-to-width          |
|-------|---------------------------------------------|------------------------------------|
| 5     | (1.968308915 7 1.1616637719999972)          | (1.555034720 7 1.1581623760000070) |
| 10    | (1.988653394 7 1.1613642060000302)          | (1.565650172 7 1.1428396089999637) |
| 25    | (1.809252233 6 0.9900703910000175)          | (1.493039642 6 0.9859764029999951) |
| 50    | (1.501522882 5 0.8258050510000317)          | (1.410216259 5 0.8243931129999851) |
| 75    | (1.116250921 4 0.6578333830000247)          | (1.280658874 4 0.6582537240000192) |
| 100   | (1.086847893 4 0.6582314920000272)          | (1.270738270 4 0.6523561829999949) |

so citar-format--truncate-string-to-column goes from being about 33% slower when every string need to be truncated so about 20% faster when every string need to be padded.

So with that change it is possible to have a midway between full names and last names which is to put propertize first names with invisible property so that they appear in string but can still be searched.

@bdarcus
Copy link
Contributor

bdarcus commented Oct 28, 2023

I think a better way to solve this problem would be allow display transform functions to make some part of the string invisible.

I was wondering about this, though didn't have time to look into it.

@torfjelde - not sure you know this, but we already use a similar strategy for the default names when a long list of authors gets truncated. You can still search those candidates because the truncated part is still there.

Well, probably more precisely: this is true for any field, I believe. E.g. a long title that gets truncated.

It provides a nice balance of allowing the general case to be pretty compact, but still allowing more of the content to be completed against.

@roshanshariff
Copy link
Collaborator

roshanshariff commented Oct 28, 2023

I've been thinking about a potential change to how Citar generates and formats candidate strings that could help with this. In short, the completion system uses candidate strings only for matching against the user input, so the strings we generate needn't be formatted for display. They can just be concatenated, untruncated text for all the fields, including invisible text like full author names.

Then we actually format the text for display in the affixation function. This would truncate and ellipsize the text in each column, for which the column boundaries would have to be encoded as text properties.

The major advantage is that the affixation function is only called for displayed candidates, not for all possible completions (e.g. Vertico only calls it for the candidates it shows in the minibuffer). This opens the door for more complicated and expensive reformatting: for example, we could use variable-pitch text and then dynamically decide where to truncate and ellipsize it to fit the frame width, like vtable.el does for in-buffer text.

@aikrahguzar
Copy link
Contributor

I think a potential downside would be that affixation function would also need to highlight the candidate since losing all highlighting of matches would be too big a loss. It is probably not too hard to overcome: completion-all-completions or completion-try-completion with only one string in table will probably do it.

@bdarcus
Copy link
Contributor

bdarcus commented Oct 28, 2023

I think a potential downside would be that affixation function would also need to highlight the candidate since losing all highlighting of matches would be too big a loss.

If I understand you right, yes, this is the downside of annotation and affixation, and the primary reasons why the current approach.

@roshanshariff
Copy link
Collaborator

@aikrahguzar I was under the impression that the completion system does highlighting of the candidate string after it has been transformed by the affixation function? As long as the affixation function merely hides or replaces text using text properties, it shouldn't affect highlighting, right?

@aikrahguzar
Copy link
Contributor

@aikrahguzar I was under the impression that the completion system does highlighting of the candidate string after it has been transformed by the affixation function? As long as the affixation function merely hides or replaces text using text properties, it shouldn't affect highlighting, right?

Looking at the elisp manual, I don't know at what stage affixation function is called. But I think you are right, and my impression was from remembering annotation-function and not affixation-function.

@torfjelde
Copy link
Author

not sure you know this, but we already use a similar strategy for the default names when a long list of authors gets truncated. You can still search those candidates because the truncated part is still there.

Gotcha:) This was actually how I initially thought it was implemented and why I sort of "expected" to be able to search full names.

So do we then close this PR in favour of the invisibility-solution? Or do we try to get this merged first, and then invisibility-solution afterwards?

@bdarcus
Copy link
Contributor

bdarcus commented Oct 29, 2023

@torfjelde

So do we then close this PR in favour of the invisibility-solution? Or do we try to get this merged first, and then invisibility-solution afterwards?

Indeed, this is the question.

@roshanshariff @aikrahguzar - thoughts? He's referring to #808.

Edit: it does occur to me contributor names are unique among the fields, in that it might make sense to by default include the full name, or the given name only, in the invisible part of the candidate, but still leave room for people to include it in the visible part?

@roshanshariff
Copy link
Collaborator

I think it makes sense to make the name-shortening functions work by hiding (rather than deleting) the extra text. This should also work well with highlighting, since the entire text will match but only the visible parts will be highlighted. It'll need @aikrahguzar's modified truncation function to allow that though. It'll still be useful to have both full-name and last-name transformers, because people will have different preferences about what should be shown.

@bdarcus bdarcus changed the title Display transform function for full namees Display transform function for full names Feb 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

4 participants