[css-text-4] hyphenate-character doesn't accept just a character #2809

Closed
litherum opened this issue Jun 23, 2018 · 13 comments
Labels: Closed Accepted by CSSWG Resolution, css-text-4, i18n-tracker

Comments

@litherum
Contributor

litherum commented Jun 23, 2018

https://drafts.csswg.org/css-text-4/#hyphenate-character

Name: 'hyphenate-character'
Value: auto | <string>

Shouldn't it accept just a single character? Or the prose should say everything except the first code point (or grapheme cluster) should be ignored? Or perhaps the property should be renamed to hyphenate-string?

@kojiishi
Contributor

cc @r12a

@kojiishi
Contributor

Good point. All hyphenations Android supports today are one code point (Hyphenator.cpp @ android.googlesource.com)

@litherum
Contributor Author

litherum commented Jul 5, 2018

I would be wary of any place in CSS that operates on single code points instead of grapheme clusters. For example, if this feature works with arbitrary ASCII text, I see no reason why it shouldn’t work with combining accents and combining emoji (like the families). In general, humans (even programmers) don’t understand the relationship between code points and things they see on the screen.
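
The gap between code points and what a reader sees can be illustrated with a short Python sketch. The sample strings and the `code_point_count` helper are illustrative assumptions, not anything taken from the spec; the point is only that each sample is perceived as one "character" while containing a different number of code points.

```python
# Each sample is something a reader would perceive as a single "character".
# The samples and the helper name are hypothetical, chosen for illustration.
samples = {
    "hyphen": "\u2010",
    "e + combining acute accent": "e\u0301",
    "family (man, woman, girl joined by ZWJ)":
        "\U0001F468\u200D\U0001F469\u200D\U0001F467",
}


def code_point_count(text: str) -> int:
    """Count code points; len() on a Python str counts code points."""
    return len(text)


for label, text in samples.items():
    print(f"{label}: {code_point_count(text)} code point(s)")
```

A rule that keeps only the first code point would preserve the plain hyphen but would strip the accent from the second sample and reduce the family to a single person.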

@litherum
Contributor Author

litherum commented Jul 5, 2018

That being said, I don’t think it’s very valuable to put emoji or tons of stacked combining marks in the string. Perhaps a better solution is for the property to just accept keywords for the various popular kinds of hyphens.

https://developer.microsoft.com/en-us/microsoft-edge/platform/usage/css/-webkit-hyphenate-character/

@kojiishi
Contributor

kojiishi commented Jul 6, 2018

I like that idea. With keywords, we can also support more complex cases.

kojiishi added the i18n-tracker label Jul 6, 2018

@css-meeting-bot
Member

The Working Group just discussed hyphenated character name, and agreed to the following:

  • RESOLVED: no to rename and add normative text desc how UI are allowed to truncate additional characters

The full IRC log of that discussion:
<dael> Topic: hyphenated character name
<dael> github: https://github.com//issues/2809
<bradk> Initial-text?
<dael> myles: it allows you to put in a whole string so hyphenate-character seems disingenuous
<dael> fantasai: I think from user view it's a single char that goes in there. hyphenate-string seems fine
<dael> fantasai: But I think we have impl shipping
<dael> myles: We are one of those impl. I was thinking you could go other direction and have accept a string and all grapheme clusters except first get ignored
<dael> fantasai: Happy to have UA truncate appropriately. Not sure first grapheme cluster is enough. There's CJK punt that's >1 code point. If that's covered seems fine
<dael> myles: How about we don't have to specific details and have that sep issue. We can say text is truncated to something similar to a grapheme cluster
<dael> fantasai: At min include 1 grapheme cluster
<dael> fantasai: Other comments?
<dael> fantasai: Proposal: hyphenate-character keeps name, but UI may/can/should truncate if it's more than one grapheme cluster
<dael> myles: Should be normative
<dael> florian: Decide how strict?
<dael> florian: If we're not sure may/can/should we should decide.
<dael> Rossen: Let's resolve on renaming property. We're saying no and add normative text desc how UI are allowed to truncate additional characters
<dael> Rossen: Objections?
<florian> I'd go with "required to truncate" rather than "allowed to truncate"
<dael> RESOLVED: no to rename and add normative text desc how UI are allowed to truncate additional characters
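
The truncation discussed in the log ("at min include 1 grapheme cluster") can be sketched roughly in Python. This is a deliberately simplified approximation, not the UAX #29 segmentation algorithm and not what any browser is known to do: the function name `first_cluster_approx` and its two rules (absorb combining marks, absorb whatever follows a ZERO WIDTH JOINER) are assumptions made for illustration.

```python
import unicodedata

ZWJ = "\u200d"  # ZERO WIDTH JOINER, used to glue emoji sequences together


def first_cluster_approx(text: str) -> str:
    """Return a rough approximation of the first grapheme cluster.

    Simplified rules (an assumption, not UAX #29): keep the first code
    point, then keep absorbing combining marks and any code point that
    directly follows a ZERO WIDTH JOINER.
    """
    if not text:
        return text
    end = 1
    while end < len(text):
        ch = text[end]
        if unicodedata.category(ch).startswith("M"):
            end += 1  # combining mark (categories Mn, Mc, Me)
        elif ch == ZWJ and end + 1 < len(text):
            end += 2  # the joiner plus the code point it attaches
        else:
            break
    return text[:end]
```

Under these assumptions, a value such as `"\u2010abc"` is cut down to the hyphen alone, while `"e\u0301"` followed by other text keeps both the base letter and its accent. A real implementation would more likely rely on a full grapheme segmenter.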

@r12a
Contributor

r12a commented Aug 1, 2018

fwiw, I checked out the use of hyphenate-character at the link in #2809 (comment). I could only find it being used in two sites (exlibris and hurriyetaile), and in both cases the 'string' identified was "\2010", ie. not a string at all.

I don't know of any real-world hyphenation approach that uses multiple characters, or a hyphenation character that decomposes into multiple characters (though that's not to say that there aren't any).

@fantasai
Collaborator

fantasai commented Sep 13, 2018

Committed text allowing (MAY) the UA to limit the hyphenation string to a single typographic character unit in bfb538a

I wanted to check with the WG on a few points, though:

  • Should we limit it to a single character unit or the UA's choice of N character units?
  • Should this affect the computed value or only the used value?
  • Should it be MAY or MUST?

Also, @r12a, do you have an example I can use that's not "\2010"? It's not a very good example for the spec since hyphenate-character: "\2010" is essentially a no-op...

@litherum
Contributor Author

litherum commented Sep 14, 2018

> Committed text allowing (MAY) the UA to limit the hyphenation string to a single typographic character unit in bfb538a
>
> I wanted to check with the WG on a few points, though:
>
>   • Should we limit it to a single character unit or the UA's choice of N character units?

Let the UA just do the right thing here. If the author supplies some ridiculously long string, I don't think I would consider content "broken" that cuts it in one place as opposed to another place. So it's probably okay if browsers differ here.

Similarly, it's conceivable that an implementation might not handle hyphenating with multiple characters at all, and such an implementation probably shouldn't be nonconformant.

>   • Should this affect the computed value or only the used value?

It's conceivable that an implementation might decide to cut the string at different places depending on the font size and the width of the containing block. This information isn't known at style resolution time, so not affecting computed value would be better.

>   • Should it be MAY or MUST?

I'd say MUST, because if the UA doesn't do any cutting at all, it's broken.

> Also, @r12a, do you have an example I can use that's not "\2010"? It's not a very good example for the spec since hyphenate-character: "\2010" is essentially a no-op...

@kojiishi
Contributor

> Also, @r12a, do you have an example I can use that's not "\2010"? It's not a very good example for the spec since hyphenate-character: "\2010" is essentially a no-op...

How about U+1400 CANADIAN SYLLABICS HYPHEN?
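
To make the two candidate values concrete, the following Python sketch decodes CSS hex escapes such as `"\2010"` and looks up the Unicode name of the result. The `decode_css_hex_escapes` helper is hypothetical and handles only a simplified form of the escape syntax (a backslash, one to six hex digits, and one optional trailing space, tab, or newline); it is not a full CSS tokenizer.

```python
import re
import unicodedata

# Simplified pattern (an assumption): backslash, 1-6 hex digits, then one
# optional space, tab, or newline that terminates the escape.
_HEX_ESCAPE = re.compile(r"\\([0-9a-fA-F]{1,6})[ \t\n]?")


def decode_css_hex_escapes(value: str) -> str:
    """Replace CSS hex escapes such as \\2010 with the character they name."""
    def replace(match):
        code_point = int(match.group(1), 16)
        # CSS maps zero, surrogates, and out-of-range values to U+FFFD.
        if (code_point == 0 or code_point > 0x10FFFF
                or 0xD800 <= code_point <= 0xDFFF):
            return "\ufffd"
        return chr(code_point)
    return _HEX_ESCAPE.sub(replace, value)


for escaped in (r"\2010", r"\1400"):
    decoded = decode_css_hex_escapes(escaped)
    print(escaped, len(decoded), unicodedata.name(decoded))
```

Both escapes decode to a single code point: `\2010` is HYPHEN and `\1400` is CANADIAN SYLLABICS HYPHEN, which is why the latter was suggested as a more distinctive example than the former.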

@fantasai
Collaborator

@litherum OK, uncommented the N-character variant of the prose and specified the used value. :)

@kojiishi Great, I updated the example. If anyone knows a language code for it, that might be nice to add to the example as well.

@r12a
Contributor

r12a commented Sep 17, 2018

> If anyone knows a language code for it, that might be nice to add to the example as well.

Based on fig. 1 at http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3427.pdf, you could safely use ojs.

@fantasai
Collaborator

@r12a Thanks, done. :)

jakearchibald pushed a commit to jakearchibald/csswg-drafts that referenced this issue Jan 16, 2023