Full Text Searchable Collections

"Lumières dans la nuit" (LDLN), Inforespace, "inforespace hors série", "inforespace bulletin d'information" and "sobeps flash": https://sceau-archives-ovni.org/mags/fts/fts.html

"The Flying Saucer Review" (FSR). Full collection, including case histories : https://sceau-archives-ovni.org/mags/fts/fts_all.html

The UFO Investigator (NICAP). Full collection : https://sceau-archives-ovni.org/mags/fts/fts_all.html


Semantic Searchable Collections

"Lumières dans la nuit" (LDLN) : https://sceau-archives-ovni.org/mags/fts/semantic_ldln.html

"The Flying Saucer Review" (FSR). Full collection : https://sceau-archives-ovni.org/mags/fts/semantic_fsr.html


Where to get access to the collections

The LDLN magazine is still being published, and you can subscribe. Subscribing is strongly recommended to keep it going. https://www.ldlnufologie.com/

The ufology magazine LDLN holds the world record for longevity in this field.

LDLN1 to LDLN300 : https://files.afu.se/Downloads/Magazines/France/LDLN/

LDLN300+ : some of these issues are available to members on the SCEAU association server, or from the AFU on request for research purposes.

Inforespace collection : https://files.afu.se/Downloads/Magazines/Belgium/Inforespace/

FSR collection : https://www.saturdaynightuforia.com/html/libraryufospecialtymagazines.html, and also probably available from the AFU on request. SCEAU also has a partial collection (paper)

UFO Investigator : http://www.cufos.org/UFOI_and_Selected_Documents/UFOI/


Semantic Search Engine

For this search engine, write a short summary of what you are looking for, phrased as it would appear in the magazine.

The meaning of your sentence is compared to the meaning of every paragraph in the collection.

The "most relevant" paragraphs are then listed, as ranked by an AI model.

Adding emphasis or slight repetition can help to get more relevant answers.

60 results are always returned, even if none of them is satisfactorily relevant.

It usually does not handle family names or other very specific terms well. For these, the Full Text search engine is what you need.
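
The ranking described above can be pictured with a minimal Python sketch. It is only an illustration under loud assumptions: the real engine uses a language model to represent meaning, while the hypothetical embed() function below is a toy word-count stand-in, and the sample paragraphs are invented. What it shows is the general shape: every paragraph gets a similarity score against the query, and the best ones are returned, however low their scores are.

```python
import math
import re
from collections import Counter

TOP_K = 60  # the page states that 60 results are always returned


def embed(text):
    """Toy stand-in (assumption) for a language-model embedding: word counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def semantic_search(query, paragraphs, top_k=TOP_K):
    """Score every paragraph against the query and keep the best top_k."""
    q = embed(query)
    scored = [(cosine(q, embed(p)), p) for p in paragraphs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Invented sample paragraphs, for illustration only.
paragraphs = [
    "The car engine stopped and the radio failed when the object came close.",
    "The editors thank all subscribers for their support this year.",
    "Witnesses reported that the light did not illuminate the ground.",
]
results = semantic_search("engine and radio failed near the object", paragraphs)
for score, text in results:
    print(f"{score:.2f}  {text}")
```

Because there is no relevance threshold, a query with no good match still fills the list; this is why the last results are often off topic.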



This tool allows you to do a semantic search on a set of documents.

Keep in mind that this search tool tries to find the pieces of text that best match your prompt.

This means that you don't need to formulate your text as a question, but rather as a piece of text that you would like to find.

You may try:

Electromagnetic effects in proximity of flying saucers

Violation of the known laws of physics by flying saucers

the purpose of the flying saucer review editors

the light from the ufo did not light the environment

List the major contributions of James McDonald for the field of ufology.

the list of books that any serious ufo investigator should read

Example of a search phrase that will still work, but only if the answer is more or less present in the text:

What is the most common color of UFO ?

Example of a search phrase that may still work, if you are lucky:

How big has been the biggest UFO seen ?

Example of a search phrase that usually does not work well, probably because of its low semantic content. A basic full text search works better for these:

Jesse Roestenberg



Full Text Search engine

It is a very good complement to the Semantic search tool when you can provide an exact request, especially for family names, where the Semantic Search engine is very weak.

No boolean operations. Case insensitive. Minimum 4 characters for the search, maximum 30.

Exact phrase match only, so it has to be used smartly.

This means you should use the shortest and most discriminating search phrase possible.

Extremely basic. For instance, if two consecutive words fall on two different lines in the issue, the match may not happen, so a little luck is sometimes needed.
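
To make these rules concrete, here is a small Python sketch of an exact phrase search with the same constraints. It is not the code running on the server: the function names, the file names and the sample texts are all hypothetical, and only the rules (4 to 30 characters, case insensitive, exact phrase) come from the description above.

```python
MIN_LEN = 4   # minimum query length stated above
MAX_LEN = 30  # maximum query length stated above


def validate_query(query):
    """Return the lowercased query, or raise ValueError if its length is out of range."""
    q = query.strip()
    if not MIN_LEN <= len(q) <= MAX_LEN:
        raise ValueError(f"query must be {MIN_LEN} to {MAX_LEN} characters long")
    return q.lower()


def full_text_search(query, pages):
    """Return the names of the texts that contain the exact phrase, ignoring case."""
    q = validate_query(query)
    return [name for name, text in pages.items() if q in text.lower()]


# Invented sample texts; the second one breaks the name across two lines.
pages = {
    "issue_001.txt": "A report mentions Jesse Roestenberg in this issue.",
    "issue_002.txt": "A report mentions Jesse\nRoestenberg in this issue.",
}

print(full_text_search("Jesse Roestenberg", pages))  # ['issue_001.txt']
print(full_text_search("roestenberg", pages))        # ['issue_001.txt', 'issue_002.txt']
```

The second query shows why a short, discriminating phrase is the safer choice: the single word survives the line break that defeats the two-word phrase.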

The web site is hosted on a shared server, which means it cannot handle a heavy load of users.


Troubleshooting: locating the same text in the PDF

Once you have identified an issue and opened it in Acrobat Reader...

Even if the PDF has been OCRed, Acrobat Reader may not find the text you type, even though it is obviously there and was found by the online search engine.

This can happen because the OCRed text is sometimes a bit scrambled, like t h i s. In that case, you will need to locate it more 'manually'. We are still trying to solve this annoyance.
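
One possible workaround, if you have extracted the text of the PDF to a plain text file, is to search with a pattern that tolerates stray spaces between letters. The Python sketch below is only a suggestion, not something the site or Acrobat Reader does; the helper name spaced_pattern and the sample text are hypothetical.

```python
import re


def spaced_pattern(phrase):
    """Build a regex matching the phrase even if whitespace was inserted between characters."""
    chars = [re.escape(c) for c in phrase if not c.isspace()]
    return re.compile(r"\s*".join(chars), re.IGNORECASE)


# Invented sample of scrambled OCR text.
ocr_text = "the object was seen by J e s s e Roes tenberg near the farm"
match = spaced_pattern("Jesse Roestenberg").search(ocr_text)
print(match.group(0) if match else "not found")
```

Since the pattern ignores all whitespace, it also matches across line breaks, at the price of occasional false positives on very short phrases.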


OCR

Limitations of the automated character recognition : no OCR is perfect, especially since some of the scans were of low quality, which results in far from perfect OCR.

The search therefore won't be perfect either, but it is fair enough as far as I can tell.


Tools

Tools used for the generation of the index :

* Bulk Rename Utility

* tools-jpeg2pdf : https://github.com/albion2000/tools-jpeg2pdf : tools to help with the bulk conversion of page scans into PDF documents ready for OCR with tools like Adobe Acrobat DC. This includes a tool for extracting the text from PDF files.

* OCR by SCEAU using Nuance PDF Create.

* https://github.com/albion2000/semanticSearch : a fork I made. The original project used a large language model to produce a sort of direct reply. However, because of the limited size of the paragraphs analyzed, and the complexity and variability of UFO phenomenology, it cannot be as smart as you are at this stage. But that may change soon. Two suitable language models are listed, for French and English. It ingests documents and provides a web interface for the search.


Endeavour

The semantic search tool requires a big database, which currently makes it problematic to extend to much bigger collections.

I described here an original goal for a better full text search : https://sceau-archives-ovni.org/mags/fts/spec/full_text_search_v2.01.pdf

If you are motivated to work with me on that, let me know. lcdvasrm at free.r


JSON Format output

If you have a serious application that would benefit from a JSON output format, ask for it.




Laurent Chabin 20190727-20230908