Help talk:CirrusSearch

From mediawiki.org
Does boost force sort=relevance?

If I am using a defined sort order, such as &sort=last_edit_desc, what happens if I choose a bounded or boosted search on, say, 100km,San Francisco? Does it turn off my sort selection and switch to relevance instead? If not, what does a search mean that combines both (or any sort option other than relevance)? (subscribed) Mathglot (talk) 05:47, 19 December 2024 (UTC)

How to search the fields of the File information template on Commons?

Nearly all of the over 110 million files on Commons use this standardized template that specifies various useful metadata like date taken and file description. How to make use of this data and search it?

For example, how could one search file descriptions for a term like "Kathmandu" (as asked about by another user), the way one can search titles with intitle? description:"Kathmandu" does show some results, but I don't know what it does, and the results don't have that word in their description. I could not find info on this at mw:Help:CirrusSearch either. Info on how to search specified fields of c:Template:Information should be added here.

EBernhardson (WMF) said: "Unfortunately, the image description is simply an argument to a template. CirrusSearch doesn't do anything at that level and can't be that specific." I think the best workaround currently is to use the insource search operator with the field name first. For example, I searched for insource:"|source=[https://soundcloud.com to identify files for c:Category:Audio files from Soundcloud.com. I think easily searching fields of the File pages' Information template could be enabled by:

  1. Developing some regex that searches for any content after e.g. |source=
  2. Creating some alias for it so instead of writing some complex regex query every time one can simply enter e.g. info-source:"soundcloud.com"

Please comment on what you think of this proposed approach, and share any info on what it would need. It would be great if somebody could develop such regexes, if there is no better way to search specific fields of the Information template. It's great that files have this structured metadata, but it would be much more useful if it were searchable.
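
To make the two steps above concrete, here is a minimal client-side sketch in Python of what such an alias could expand to. Everything in it is an assumption for illustration: the info-<field>:"term" alias syntax, the function names, the list of characters that need escaping, and the exact regex shape. It has not been tested against a live wiki, and it would miss matches where the field value itself contains a pipe (for example a nested template).

```python
import re

# Characters assumed to be special in CirrusSearch's Lucene-style regex syntax,
# plus "/" because it delimits the insource:/.../ pattern.
_SPECIAL = set('.?+*|{}[]()"\\#@&<>~/')


def escape_for_insource(text: str) -> str:
    """Backslash-escape every character assumed to be a regex metacharacter."""
    return "".join("\\" + ch if ch in _SPECIAL else ch for ch in text)


# Hypothetical alias syntax: info-<field>:"<term>"
_ALIAS = re.compile(r'info-([A-Za-z_]+):"([^"]*)"')


def expand_info_alias(query: str) -> str:
    """Rewrite each hypothetical info-<field>:"term" alias into an insource regex.

    The generated pattern looks for "|field=" followed by any run of
    non-pipe characters and then the term.
    """
    def _expand(match: re.Match) -> str:
        field = escape_for_insource(match.group(1))
        term = escape_for_insource(match.group(2))
        return f"insource:/\\| *{field} *=[^\\|]*{term}/"

    return _ALIAS.sub(_expand, query)


print(expand_info_alias('info-source:"soundcloud.com"'))
```

The expanded string could then be pasted into the search box. A bare regex like this would still be slow on a wiki the size of Commons, so in practice it would probably need to be combined with an ordinary indexed term.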

Previously asked here. Maybe c:Module:Information could be used for this somehow. rspective (talk) 16:39, 5 March 2025 (UTC)

Regex search speed

In my experience, bare regex searches seem to work even without any other terms. For example, https://syl.wikipedia.org/w/index.php?go=Go&search=insource%3A%2F%5C%7B%5C%7BINTERWIKI%2F&title=%EA%A0%9B%EA%A0%A4%EA%A0%A1%EA%A0%A6%EA%A0%A1%3ASearch&ns0=1 completes quickly and returns 43 results. Why is that, and can the warning be removed? * Pppery * it has begun 14:24, 1 April 2025 (UTC)

As I understand it, the regex search still scans the whole database unless the search is narrowed with an index-based parameter or filter. The syl-wiki database is simply small enough that the regex search finishes before the timeout. However, the help page applies to every possible MediaWiki installation, so every potential caveat has to be addressed. -- Speravir (talk) – 23:47, 11 April 2025 (UTC)
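
As an illustration of narrowing a regex search with an indexed term, the following Python sketch builds a search URL that pairs a plain insource:"..." phrase with the insource:/.../ regex. The function name and its parameters are hypothetical, and passing search and ns0 straight to index.php is an assumption based on the URL quoted above.

```python
from urllib.parse import urlencode


def narrowed_regex_search_url(base: str, literal: str, regex: str) -> str:
    """Build a search URL that combines an indexed phrase with a regex.

    `literal` is a plain phrase that can be answered from the index;
    `regex` is the already-escaped pattern to run on the remaining pages.
    """
    query = f'insource:"{literal}" insource:/{regex}/'
    # ns0=1 restricts the search to the main namespace, as in the quoted URL.
    return f"{base}?{urlencode({'search': query, 'ns0': 1})}"


url = narrowed_regex_search_url(
    "https://syl.wikipedia.org/w/index.php",
    "INTERWIKI",
    r"\{\{INTERWIKI",
)
print(url)
```

On a small wiki the indexed term makes little difference, but on a large one it limits how many pages the regex has to scan.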

not displayable chars U+10FFFF

In the chapter "Substitutions for some metacharacters", in the columns 'CirrusSearch' and text, all characters explained as being U+10FFFF are displayed as a default pavement char, as for non-existent characters. Should this be adapted, or is it useless? Thanks. -- Christian 🇫🇷 FR 🚨 (talk) 08:04, 7 April 2025 (UTC)

You may see a “default pavement char”; I see a glyph with the Unicode number imprinted (very small). In general, you get a glyph from a font selected by your browser, if there is one; otherwise the behaviour apparently depends on the browser. At this very place it means that the character U+10FFFF is actually present, and you can copy and paste it. -- Speravir (talk) – 00:05, 12 April 2025 (UTC)