Scientific publishing
thm_w:
--- Quote from: CatalinaWOW on October 25, 2023, 11:09:41 pm ---I would agree that the people who are truly superior at something and truly cannot communicate are rare.
--- End quote ---
Add to that people who can't communicate perfectly in English when it is not their native language.
75-90% of papers are published in English. You are a lot less likely to be cited in a non-English-language paper. Wanting to spend your time on something other than learning English seems like it should be an allowable choice to make. Some people are terrible at learning a language but amazing scientists.
Nominal Animal:
--- Quote from: CatalinaWOW on October 25, 2023, 11:09:41 pm ---the publication system is broken because publication rather than quality is in the self interest of a large fraction of the community.
--- End quote ---
Yes: I also believe that is the largest contributing factor.
--- Quote from: CatalinaWOW on October 25, 2023, 11:09:41 pm ---True AI? I am not sure we can even define what intelligence is.
--- End quote ---
The g-factor as used in psychometrics, measured statistically from one's ability to solve problems one has not encountered before, is a useful definition.
I tend to default to that one, until a more useful definition happens to crop up.
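As a concrete (if toy) illustration of what "statistical" means here: psychometricians typically estimate g as the dominant shared factor in scores across diverse tests, for example via the first principal component of the inter-test correlation matrix. Everything below is synthetic and purely illustrative, a minimal sketch rather than a real psychometric pipeline.

--- Code: ---
# Minimal sketch: estimating g as the first principal component of the
# correlation matrix of test scores. All data here is synthetic.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical scores: 200 people x 5 unfamiliar problem-solving tests,
# each loading on one shared latent ability plus test-specific noise.
latent = rng.normal(size=(200, 1))                  # shared "g" component
loadings = rng.uniform(0.5, 0.9, size=5)            # per-test g loading
scores = latent * loadings + rng.normal(scale=0.6, size=(200, 5))

corr = np.corrcoef(scores, rowvar=False)            # inter-test correlations
eigvals, eigvecs = np.linalg.eigh(corr)             # ascending eigenvalues
g = eigvecs[:, -1] * np.sign(eigvecs[:, -1].sum())  # first PC, sign-fixed
print("variance explained by g:", eigvals[-1] / eigvals.sum())
print("test loadings on g:", np.round(g, 2))
--- End code ---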
--- Quote from: CatalinaWOW on October 25, 2023, 11:09:41 pm ---Calling the large memory models Babel babble generators overstates the case.
--- End quote ---
I disagree, obviously, but here is the reason: the models construct sentences based on a massive set of internally weighted statistical relationships. (Look up the machine-learning transformer model for a better description of that.)
LLMs do not consider the content of any word: technically speaking, the exact difference between the token at hand and its nearest neighbours. At best, you can claim they roughly model the relative magnitude of such differences.
By one definition, to babble is to utter meaningless words. I am using that as the exact technical definition for the output, since the LLMs cannot know the meaning of the individual words, only their relationships to each other. The latter is also why the output seems intelligent, but is not. (A comparison to a very powerful search engine over its source material using fuzzy matching is also apt.)
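To make the "statistics only" point concrete, here is a toy illustration. This is not how a transformer works internally (transformers weigh relationships across long contexts via attention), but it shows the same principle at minimal scale: a bigram model emits locally fluent word sequences purely from co-occurrence counts, with no representation of meaning anywhere.

--- Code: ---
# Toy "babble from statistics alone": pick each next word purely from
# observed co-occurrence counts. The model has no notion of meaning, yet
# its output is locally fluent. A transformer uses vastly richer weighted
# relationships, but likewise chooses tokens by relationship, not content.
import random
from collections import defaultdict

corpus = ("the model predicts the next word from the previous word and "
          "the previous word constrains the next word the model emits").split()

follows = defaultdict(list)            # word -> list of observed followers
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev].append(nxt)

def babble(start, length=12):
    words = [start]
    for _ in range(length):
        choices = follows.get(words[-1])
        if not choices:
            break
        words.append(random.choice(choices))  # sample by observed frequency
    return " ".join(words)

print(babble("the"))
--- End code ---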
I do recognize that the human language acquisition process also starts with a similar charting of relationships. However, by interacting with others (especially face-to-face, so that body language and microexpressions affect our own understanding of each term), we refine the meaning each word has for ourselves. Similarly, sentence construction, word order, and so on are built in interaction, with each interaction adding meaning (based on the difference between the original intent and the observed reaction) on top of those associations.
--- Quote from: CatalinaWOW on October 25, 2023, 11:09:41 pm ---I would agree that the people who are truly superior at something and truly cannot communicate are rare. But there is a broad spectrum of this capability just as in any other area of human performance.
--- End quote ---
Sure: just look at my own output. I often fail at English. Like LLMs, my output is verbose and typically well-structured, yet I still fail, because of a lack of face-to-face use (and thus a lack of the direct feedback that bypasses my conscious mind and reaches my language understanding), and because I fail to predict how specific terms and sentences are understood, perceived, and assigned meaning by others.
I do believe that if we develop LLMs into tools that can track their sources, and use models generated from controlled datasets, we can build tools that would help a lot, especially with scientific communication.
In simple terms, that corresponds to creating LLMs that can translate jokes and anecdotes, perhaps even poems, across languages while still tracking the reasons for their choices (for example, by citing the source materials that most strongly affected each choice).
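No such tool exists yet, but a crude sketch of the "track its sources" contract might look like the following. Everything here is hypothetical: the scoring heuristic, the corpus entries, and the function names are illustrative stand-ins, not any real system's API. The point is only the output shape: every produced span carries the passages that drove it.

--- Code: ---
# Crude sketch of source-tracked generation over a controlled corpus.
# All identifiers are illustrative; no real LLM API is modelled here.
from dataclasses import dataclass

@dataclass
class Attributed:
    text: str            # the produced span
    sources: list[str]   # IDs of the corpus passages that drove the choice

def overlap_score(query: str, passage: str) -> float:
    """Stand-in relevance heuristic: fraction of shared words."""
    q, p = set(query.lower().split()), set(passage.lower().split())
    return len(q & p) / max(len(q), 1)

def generate_with_sources(prompt: str, corpus: dict[str, str],
                          top_k: int = 2) -> Attributed:
    ranked = sorted(corpus, reverse=True,
                    key=lambda pid: overlap_score(prompt, corpus[pid]))[:top_k]
    # A real system would condition its generation on these passages;
    # here we merely echo them to show the traceability contract.
    draft = " ".join(corpus[pid] for pid in ranked)
    return Attributed(text=draft, sources=ranked)

corpus = {"smith2019:p4": "transformer attention weighs token relationships",
          "lee2021:p2":   "retrieval grounds generated text in cited sources"}
out = generate_with_sources("how do transformers weigh token relationships",
                            corpus)
print(out.text, "| sources:", out.sources)
--- End code ---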
To continue my gun analogy, those would correspond to bolt guns, dart guns with a variety of medical substances available, guns designed for safely shooting blanks at short range (for use in entertainment), and so on. We just are not there yet. Nobody seems even remotely interested in developing such tools, in fact. Instead, LLMs are used as if they were already 'there', with the end result that they only make it easier for those who do not have anything meaningful to say to couch that non-message in an attractive outer shape. Thus my opinion that they just cause more shit to be generated.
(It is interesting to compare LLM proponents' assertions and beliefs to those of explosive and weapon inventors. The belief that sufficiently efficient killing machines would prevent wars and save lives, by making the cost in human lives excessive, has been common. But perhaps this comparison is too 'angry', and something like adding tetraethyl lead to gasoline to improve engine efficiency would be more apt. Or perhaps that too is 'too negative' for the LLM proponents.)
One of my hobbies is looking for science fiction stories with interesting storylines. I'm not that interested in the characters per se; I'm mostly interested in the events depicted. Many aspiring authors are now using LLMs to "flesh out their ideas", and the output (from my point of view) is such crap and such a waste of my time that I've started to avoid looking at the output of new authors altogether. Granted, perhaps my view of LLM use is overly negatively colored because of this, but having an academic background myself, I do not see scientific authors behaving any differently.
jpanhalt:
--- Quote from: thm_w on October 25, 2023, 11:31:01 pm ---
--- Quote from: CatalinaWOW on October 25, 2023, 11:09:41 pm ---I would agree that the people who are truly superior at something and truly cannot communicate are rare.
--- End quote ---
Add to that people who can't communicate perfectly in English when it is not their native language.
75-90% of papers are published in English. You are a lot less likely to be cited in a non-English-language paper. Wanting to spend your time on something other than learning English seems like it should be an allowable choice to make. Some people are terrible at learning a language but amazing scientists.
--- End quote ---
If one considers scientific writing per se, it can be divided into three efforts, excluding such things as copy editing, addressing reviewers' comments, and so forth. Those three efforts might be called literature review & citation, composition, and language translation.
I am concerned about the composition part, be it by a ghost writer or AI. Computer translation is not an issue, and computer-aided literature review has been around for more than 50 years. Ghost writers are generally revealed in the paper, either under the authors' names or in the acknowledgements. And in my view, regardless of how the text is created, those whose names appear as authors need to be held fully accountable, with no excuses. That was not the case in the Robert Good instance, nor in any other scandal of which I am aware.
hans:
Most of my paper "reading" doesn't even include reading text. I may read the abstract and introduction/conclusion, but most of the searching and filtering is done by browsing figures, and only if something looks interesting will I look at the text for details. I suppose it's similar when browsing datasheets. Figures, tables: that's where the content is.
In all the papers I wrote, the journey starts with selecting material and getting the math & figures nicely laid out. We read papers from the figures; we don't write novels. So I don't think an AI will offer much assistance on the actual content side of a paper.
Of course there still needs to be some text, and you don't need AI to have hot takes; it happens all the time. But let's not forget that papers are not books, and I think it's hazardous to treat them as 100% factual. The qualitative reasoning, especially in the introduction/conclusion, can be quite opinionated. Sometimes I even find it hilarious.
Personally I do think there is some place for language tools in writing, but I don't think it's AI. I don't want an AI that helps me with signposts and jargon; that's going to end terribly. Grammar and spelling checking are perhaps more useful, but those tools already exist, and we still have papers full of mistakes (although not every mistake makes a paper worthless). So, if anything, I agree it's not going to change for the better with AI, but I don't think it will end in disaster either.
Infraviolet:
So, by having an LLM "assist" in writing papers, they can now ever more closely resemble existing papers, which are already mostly so badly written as to be unintelligible (too much focus on compressing meaning into short word counts but with long sentences, never really explaining anything properly, citing loads of prior work without really summarizing it or saying what aspect of it is relevant...).
If academic writing is to be improved it needs more authors stepping away from the abysmal conventions it has gotten stuck in, not letting an AI entrench those conventions further.