Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Syntax highlighting doesn't work well on files with .tsx extension #7039

Open
vicasas opened this issue Sep 15, 2024 · 5 comments · May be fixed by #7049
Open

Syntax highlighting doesn't work well on files with .tsx extension #7039

vicasas opened this issue Sep 15, 2024 · 5 comments · May be fixed by #7049
Labels

Comments

@vicasas
Copy link

vicasas commented Sep 15, 2024

Describe the bug

When opening a pull request or when the code is merged into the main branch, the syntax highlighting for .tsx TypeScript React files doesn't work properly, leaving part of the file in plain text.

Expected behaviour

The expected behavior should be to always display the syntax highlighting that corresponds to the .tsx extension.

Additional notes

As additional information, I have managed to correct this by importing React at the beginning of each file.

import * as React from 'react';

But this should not be necessary today, since it is not an obligation to have to import React.

Captura de pantalla 2024-09-15 a la(s) 11 33 13
@vicasas vicasas added the Bug label Sep 15, 2024
@lildude
Copy link
Member

lildude commented Sep 15, 2024

This will be because the .tsx extension is shared with XML and the heuristic to differentiate the two is looking for react:

- extensions: ['.tsx']
rules:
- language: TSX
pattern: '^\s*(import.+(from\s+|require\()[''"]react|\/\/\/\s*<reference\s)'
- language: XML
pattern: '(?i:^\s*<\?xml\s+version)'

If the file doesn't match that or the XML heuristic, things fall through to the classifier which uses the samples we have to guess and all but one of the samples have React in them.

There are two solutions:

  1. Improve the heuristic or
  2. Add more samples which don't have React in them to better train the classifier.

The former is more reliable but likely to be harder.

If you know of a reliable regex that can be used, please feel free to submit a PR to improve the heuristic.

@vicasas
Copy link
Author

vicasas commented Sep 15, 2024

@lildude Oh, I see! I understand that this requires importing React for proper detection. However, I have two questions:

  1. In what cases could a file with the .tsx extension be treated as XML? I wonder if it might be better to remove this possibility, or is it intended for specific cases like React Native?

  2. Would it be possible to reverse the matching logic? That is, first attempt to matchXML, and if it doesn’t, assume by default that the file isTSX.

On the other hand, I wonder what makes this file have proper syntax highlighting without importing React. However, if I remove most of the HTML or the class names, it turns into a plain text file. The file extension is .tsx.

Captura de pantalla 2024-09-15 a la(s) 7 59 49 p  m \n

The same thing happens with this other .tsx extension file, which has React components, but its syntax highlighting seems fine.

Captura de pantalla 2024-09-15 a la(s) 8 33 56 p  m

@lildude
Copy link
Member

lildude commented Sep 16, 2024

1. In what cases could a file with the .tsx extension be treated as XML? I wonder if it might be better to remove this possibility, or is it intended for specific cases like React Native?

If the file contains XML 😉 From the comments in #2761 and #3464 these appear to be tileset map files.

2. Would it be possible to reverse the matching logic? That is, first attempt to matchXML, and if it doesn’t, assume by default that the file isTSX.

It won't make a difference as both heuristics are quite precise with neither being a fallback "default" option. The reason we're getting the misclassification is the files aren't matching either regex so it's falling through to the classifier.

On the other hand, I wonder what makes this file have proper syntax highlighting without importing React.

I can't determine the source of your files so I can't test it and be precise, but I suspect the content contains tags in common with those in the samples which are used to train the classifier and the classifier has got it right this time.

@vicasas
Copy link
Author

vicasas commented Sep 16, 2024

It won't make a difference as both heuristics are quite precise with neither being a fallback "default" option. The reason we're getting the misclassification is the files aren't matching either regex so it's falling through to the classifier.

Could we make it assume that it is a .tsx file if it doesn't match any heuristics, perhaps using a negative_pattern?

I can't determine the source of your files so I can't test it and be precise, but I suspect the content contains tags in common with those in the samples which are used to train the classifier and the classifier has got it right this time.

If it is of any use, I leave below the complete code that the classifier does classify correctly without the need to use a React import.

Example
import Image from 'next/image'
import styles from './page.module.css'

export default function Home() {
  return (
    <main className={styles.main}>
      <div className={styles.description}>
        <p>
          Get started by editing&nbsp;
          <code className={styles.code}>src/app/page.tsx</code>
        </p>
        <div>
          <a
            href="https://vercel.com?utm_source=create-next-app&utm_medium=appdir-template&utm_campaign=create-next-app"
            target="_blank"
            rel="noopener noreferrer"
          >
            By{' '}
            <Image
              src="/vercel.svg"
              alt="Vercel Logo"
              className={styles.vercelLogo}
              width={100}
              height={24}
              priority
            />
          </a>
        </div>
      </div>

      <div className={styles.center}>
        <Image
          className={styles.logo}
          src="/next.svg"
          alt="Next.js Logo"
          width={180}
          height={37}
          priority
        />
      </div>

      <div className={styles.grid}>
        <a
          href="https://nextjs.org/docs?utm_source=create-next-app&utm_medium=appdir-template&utm_campaign=create-next-app"
          className={styles.card}
          target="_blank"
          rel="noopener noreferrer"
        >
          <h2>
            Docs <span>-&gt;</span>
          </h2>
          <p>Find in-depth information about Next.js features and API.</p>
        </a>

        <a
          href="https://nextjs.org/learn?utm_source=create-next-app&utm_medium=appdir-template&utm_campaign=create-next-app"
          className={styles.card}
          target="_blank"
          rel="noopener noreferrer"
        >
          <h2>
            Learn <span>-&gt;</span>
          </h2>
          <p>Learn about Next.js in an interactive course with&nbsp;quizzes!</p>
        </a>

        <a
          href="https://vercel.com/templates?framework=next.js&utm_source=create-next-app&utm_medium=appdir-template&utm_campaign=create-next-app"
          className={styles.card}
          target="_blank"
          rel="noopener noreferrer"
        >
          <h2>
            Templates <span>-&gt;</span>
          </h2>
          <p>Explore starter templates for Next.js.</p>
        </a>

        <a
          href="https://vercel.com/new?utm_source=create-next-app&utm_medium=appdir-template&utm_campaign=create-next-app"
          className={styles.card}
          target="_blank"
          rel="noopener noreferrer"
        >
          <h2>
            Deploy <span>-&gt;</span>
          </h2>
          <p>
            Instantly deploy your Next.js site to a shareable URL with Vercel.
          </p>
        </a>
      </div>
    </main>
  )
}

For now what I did to avoid the syntax problem was to make an overrides using the .gitattributes file. This apparently only works when the code reaches the corresponding branch, since when opening a pull request it is still seen as plain text. Is there any configuration for this?

# Reclassifying `.tsx` files as `tsx` to fix GitHub syntax highlighting issues.
# Current Linguist heuristics do not correctly recognize `.tsx` files without `import React`.
# https://github.com/github-linguist/linguist/issues/7039
*.tsx linguist-language=tsx

@lildude
Copy link
Member

lildude commented Sep 17, 2024

Could we make it assume that it is a .tsx file if it doesn't match any heuristics, perhaps using a negative_pattern?

It already is a .tsx file 😉 If you mean TSX, yes, switching the order and removing the TSX pattern will do the trick… if XML isn't matched, use TSX. This however becomes a problem in future if a new language starts using the same extension but I think that's a hurdle that can be dealt with then. The XML heuristic is specific enough for now. Feel free to submit a PR with this change.

Is there any configuration for this?

No.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants