Topic: Converting long filenames to 8.3

peter-h (topic starter)
« on: August 06, 2022, 10:14:24 pm »
I am writing a simple web server which has a file upload feature.

FatFS is used to implement a FAT12 (2MB) flash file system, with LFNs disabled.

However, the Windows file picker does make it possible for someone to select a file with a long filename for upload.

Would FatFS convert this to 8.3 or do I have to do that?

And if FatFS doesn't do that (I suspect it will fail the f_open if it gets a filename that is not valid 8.3), can anyone suggest a way to do the conversion? There are various crude filename truncation routines out there. I think the one which Windows uses needs access to the entire directory, though, to avoid conflicts.

Many thanks in advance for any pointers.

I could just reject such a file at the upload point, as I have to do anyway if e.g. it is bigger than remaining filespace. But it would be better to truncate it.

I don't have any need to do the extra stuff e.g. rejecting filenames like COM2 LPT1 etc.


Z80 Z180 Z280 Z8 S8 8031 8051 H8/300 H8/500 80x86 90S1200 32F417
 

Ed.Kloonk
« Reply #1 on: August 06, 2022, 11:47:01 pm »
My limited understanding of long filenames, from when they were coming in to Windows/DOS, is that the OS, for compatibility, bestowed a legacy truncated 8.3 filename on each file, so that whatever lib used the file services (old or new) could operate seamlessly, with varied results.

Unless I don't fully understand your issue, whatever is writing to the old filesystem should be truncating the filename, shouldn't it?
iratus parum formica
 

SiliconWizard
« Reply #2 on: August 07, 2022, 12:45:14 am »
FatFs will not do anything about it, especially if LFN is disabled.

If you're directly managing file creation with f_open() and the file names come from an HTTP request (as I suppose given your project), you will unfortunately have to deal with that yourself.

Since LFN support is disabled in your FatFs build, you can't store LFNs, so you don't need to use the "official" way of generating 8.3 names from LFNs. The original file names will be unrecoverable anyway. So you basically need to write a function that transforms the name: remove characters that are not supported in 8.3 FAT names, and limit the name to 8 characters (plus at most 3 for the extension, if any). You'll also have to avoid file name collisions: once you have such an 8.3 name, you need to check whether a file with the same name already exists in the same directory. If so, you'll have to transform the new name with a numbering scheme (dropping as many trailing characters of the 8-character part as needed to accommodate the number). This is pretty tedious, but I don't see any other way.
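
As a rough illustration of the name transformation only (not the collision handling), here is a minimal sketch in JavaScript, the language of the upload page posted later in this thread. The function names and the "FILE" fallback are hypothetical, and the allowed-character set is a conservative assumption: FatFs's own rules may differ, for example depending on the configured code page.

```javascript
// Characters assumed valid in an 8.3 name besides A-Z and 0-9.
// Conservative assumption; FatFs's own checks may accept more or less.
const SFN_EXTRA = "!#$%&'()-@^_`{}~";

// Upper-case the input and drop every character outside the assumed set.
function keepValid(s) {
  let out = "";
  for (const ch of s.toUpperCase()) {
    const isAlnum = (ch >= "A" && ch <= "Z") || (ch >= "0" && ch <= "9");
    if (isAlnum || SFN_EXTRA.includes(ch)) out += ch;
  }
  return out;
}

// Hypothetical helper: turn an arbitrary file name into an 8.3 candidate.
function toShortName(longName) {
  // The last dot separates the extension; a leading dot is not an extension.
  const dot = longName.lastIndexOf(".");
  const hasExt = dot > 0;
  const base = keepValid(hasExt ? longName.slice(0, dot) : longName).slice(0, 8);
  const ext = hasExt ? keepValid(longName.slice(dot + 1)).slice(0, 3) : "";
  const safeBase = base || "FILE"; // fallback if no valid character survives
  return ext ? safeBase + "." + ext : safeBase;
}
```

For example, `toShortName("My Long Report (final).docx")` gives `MYLONGRE.DOC`. The result is only a candidate: it still has to be checked against the names already in the directory.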

Note that this is not just a problem of length. Only a restricted set of characters is valid for 8.3, so if the filename contains any unsupported character, even if it is short enough, that will fail as well. If you have no control over the files the user can pick, then you need to handle this too.

One problem with this scheme is that if you want your users to be able to replace an existing file and said file originally has a long file name, then you can't: no way of knowing this was the same file name, so all you can do is save the new file while the old one will be still there with a different number (or no number if it was the first one.)

The joys of FAT.
« Last Edit: August 07, 2022, 12:57:17 am by SiliconWizard »
 
ledtester
« Reply #3 on: August 07, 2022, 01:42:57 am »
Quote
And if FatFS doesn't do that (I suspect it will fail the f_open if it got a non 8.3 valid filename) can anyone suggest a way to do the conversion?

What exactly is your use case? Do you have to remember the filename the user uploaded? If not, why not just generate your own file name, e.g. from a random number or a sequence counter?

For instance, consider how this forum works. At the end of this message I've attached the same text file three times but they all have different links:

- https://www.eevblog.com/forum/programming/converting-long-filenames-to-8-3/?action=dlattach;attach=1559716
- https://www.eevblog.com/forum/programming/converting-long-filenames-to-8-3/?action=dlattach;attach=1559719
- https://www.eevblog.com/forum/programming/converting-long-filenames-to-8-3/?action=dlattach;attach=1559722
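
A sequence-counter scheme could look something like the sketch below. The "UP" prefix, the 6-digit width and the default "BIN" extension are all made up for illustration; the only point is that the generated name is always valid 8.3, whatever the user's file was called.

```javascript
// Hypothetical naming scheme: "UP" + 6-digit counter + fixed extension,
// e.g. UP000042.BIN. The uploaded file's own name is not used at all.
function sequentialName(counter, ext = "BIN") {
  if (!Number.isInteger(counter) || counter < 0 || counter > 999999) {
    throw new RangeError("counter must be an integer in 0..999999");
  }
  return "UP" + String(counter).padStart(6, "0") + "." + ext;
}
```

The counter itself would have to be kept somewhere persistent, or recovered by scanning the directory for the highest number in use.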
« Last Edit: August 07, 2022, 01:54:37 am by ledtester »
 

peter-h (topic starter)
« Reply #4 on: August 07, 2022, 08:22:13 am »
This is as I suspected - not trivial.

The algorithm which M$ use can't be trivial because after truncation you need to keep incrementing the numeric part while checking for a name collision after each one. I don't think it is doing that, because on a PC you could have 10k-100k files (in a subdir, anyway) so it must grab a linked list of the files, sort it, and determine the first candidate for a truncated filename.

Also the directory is quite likely to already contain files with M$-truncated names. You can see one here:

[screenshot not preserved]

So just picking up one will yield an immediate collision :)

I will just refuse to upload a long filename...

The use case is an embedded filesystem, FAT12, LFN disabled (FatFS implementation) with a file upload feature, accessible via an HTTP server (in the pic) and I can't actually stop somebody trying to upload a long filename.

The filesystem is also accessible via USB, as a removable block device, and there Windows just sees plain 512 byte sectors so it does what it likes in there. If you write an LFN that way, the embedded side sees just the 8.3 filename, so that works ok.

« Last Edit: August 07, 2022, 08:24:31 am by peter-h »
 

mariush
« Reply #5 on: August 07, 2022, 08:49:38 am »
Just truncate the file name to 6 or 8 characters. If truncated to 6, you could then add 1-2 characters (a counter or something) in case a file with same name already exists.
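
Something along these lines, as a sketch only. The function name is hypothetical, the "~N" suffix style is borrowed from the Windows convention rather than required, and `existing` stands in for whatever "does this file already exist" test the server has (here it is simply a Set of upper-case names).

```javascript
// Hypothetical helper: pick a free 8.3 name for an already-cleaned base and
// extension. `existing` is a Set of names currently in the directory.
function uniqueShortName(base, ext, existing) {
  const join = (b) => (ext ? b + "." + ext : b);
  const plain = join(base.slice(0, 8));
  if (!existing.has(plain)) return plain;
  for (let n = 1; n <= 99; n++) {
    const suffix = "~" + n; // "~1" .. "~99"
    // Shorten the base just enough for the suffix to fit in 8 characters.
    const candidate = join(base.slice(0, 8 - suffix.length) + suffix);
    if (!existing.has(candidate)) return candidate;
  }
  return null; // no free name found; the caller should reject the upload
}
```

Because every candidate is tested against the directory, names that Windows has already truncated to the same "~1" form are skipped rather than overwritten.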

Without thinking too much, I would basically accept any file upload and save it as it comes under a name like "TMPmmss", plus maybe some characters in the extension that you otherwise wouldn't accept (so you know for sure there's no way such a file name already exists, or something like that). When the transfer is done, you have all the time in the world to check whether the filename supplied by the user can be used, truncate it as needed, then check all the other files in that folder and adjust the name as needed. When done, rename the temporary file entry.

 

HwAoRrDk
« Reply #6 on: August 07, 2022, 10:31:57 am »
Quote
The algorithm which M$ use can't be trivial because after truncation you need to keep incrementing the numeric part while checking for a name collision after each one. I don't think it is doing that, because on a PC you could have 10k-100k files (in a subdir, anyway) so it must grab a linked list of the files, sort it, and determine the first candidate for a truncated filename.

I don't think you'd need to worry about excessive number of files making for a slow collision-finding operation, as IIRC FAT12 filesystems simply can't support a huge number of files. I think the filesystem as a whole has a total limit, not particularly per directory, although I believe the root directory is special and has a certain limit - 256 or 512 files, I forget exactly. I think maybe the limits also depend on how the filesystem is formatted - sector size, etc.
 

peter-h (topic starter)
« Reply #7 on: August 07, 2022, 10:58:42 am »
One simple solution I have is to check the null-terminated filename string for a valid 8.3 name. There is probably code out there (I could not find C source, only Python type stuff which obviously makes it easy) or I can write some. And dump the upload otherwise.

Quote
Just truncate the file name to 6 or 8 characters. If truncated to 6, you could then add 1-2 characters (a counter or something) in case a file with same name already exists.

This raises an interesting option. If not valid 8.3 then I could upload with a temp filename and put in a Rename option. Actually, in the Edit function I already return the TEXTAREA into a temp file and then delete/rename if various tests passed, including checking filesize declared == #bytes written to disk. Then I delete original and rename the temp. But with the Upload function I can't do that because I have max 2MB space but in a valid scenario could be uploading a 1.9MB file.

Quote
I don't think you'd need to worry about excessive number of files making for a slow collision-finding operation, as IIRC FAT12 filesystems simply can't support a huge number of files. I think the filesystem as a whole has a total limit, not particularly per directory, although I believe the root directory is special and has a certain limit - 256 or 512 files, I forget exactly. I think maybe the limits also depend on how the filesystem is formatted - sector size, etc.

Yes. There is a 512 file limit in the root, and my web UI (and the embedded filesystem API) does not support subdirectories where you could have much more. It displays a subdir in the root, as you can see above. I can't rule out the possibility that somebody might create 500 files there and it needs to still work... With FatFS accessing the SPI FLASH at 21MHz the filesystem is extremely fast so that's not a problem, but I still have to write all the code :) OTOH checking if a file exists is a built-in FatFS function (f_open). The sectors are 512 bytes, for compatibility with Windoze which expects a 512 byte sector in a removable device.
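
To put rough numbers on that, here is the arithmetic as a small sketch. The 32-byte directory entry size is standard for FAT; the 512-entry root and 512-byte sectors are the figures above; treating "2MB" as 2 MiB is an assumption.

```javascript
const DIR_ENTRY_BYTES = 32; // size of one FAT directory entry
const SECTOR_BYTES = 512;   // sector size used on this device
const ROOT_ENTRIES = 512;   // root directory limit mentioned above

// Space taken by a completely full root directory.
const rootDirBytes = ROOT_ENTRIES * DIR_ENTRY_BYTES;
const rootDirSectors = rootDirBytes / SECTOR_BYTES;

// Whole volume, assuming "2MB" means 2 MiB.
const volumeSectors = (2 * 1024 * 1024) / SECTOR_BYTES;

console.log(rootDirBytes, rootDirSectors, volumeSectors);
```

So even a full root directory is only 16 KiB, i.e. 32 sectors, which is the most a single scan for a name collision would have to read.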

I was also going to implement dir sorting
https://www.eevblog.com/forum/programming/fatfs-sorted-directory-listing-but-without-needing-a-large-ram-buffer/
but I am not doing that right now. This part of the project was originally contracted out but ran to too many hours so I stopped it and wrote it myself (~1 man-week so far), but that guy was implementing sorting with a JS script, which involved custom hidden tags of some sort. If I were doing it I would do it as per that link, in the server, but that's another story.
 

ledtester
« Reply #8 on: August 07, 2022, 11:19:10 am »

Quote
I will just refuse to upload a long filename...

The use case is an embedded filesystem, FAT12, LFN disabled (FatFS implementation) with a file upload feature, accessible via an HTTP server (in the pic) and I can't actually stop somebody trying to upload a long filename.


Also bear in mind that the HTTP request may not have been generated by a file picker dialog and the filename in the request may be arbitrarily long and/or contain non-ASCII characters.

RFC 6266 describes how filenames should be encoded in Content-Disposition headers:

https://datatracker.ietf.org/doc/html/rfc6266
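
To give an idea of what can arrive, here is a minimal sketch of pulling the filename out of such a header. It is not a full RFC 6266 parser: the function name is hypothetical, only the quoted `filename="..."` form, the bare token form and the `filename*=UTF-8''...` percent-encoded form are handled, and escaped quotes are ignored.

```javascript
// Minimal sketch, not a complete RFC 6266 implementation.
function filenameFromContentDisposition(header) {
  // Extended form: filename*=UTF-8''percent-encoded-bytes (takes precedence).
  const ext = /filename\*\s*=\s*UTF-8''([^;\s]+)/i.exec(header);
  if (ext) {
    try {
      return decodeURIComponent(ext[1]);
    } catch (e) {
      return null; // malformed percent-encoding
    }
  }
  // Plain quoted form: filename="example.html"
  const quoted = /filename\s*=\s*"([^"]*)"/i.exec(header);
  if (quoted) return quoted[1];
  // Unquoted token form: filename=example.html
  const token = /filename\s*=\s*([^;\s]+)/i.exec(header);
  return token ? token[1] : null;
}
```

With the header `attachment; filename*=UTF-8''%e2%82%ac%20rates` this returns the name "€ rates", which contains both a space and a non-ASCII character, so it would still have to be rejected or transformed before it goes anywhere near an 8.3-only filesystem.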
« Last Edit: August 07, 2022, 11:20:46 am by ledtester »
 

peter-h (topic starter)
« Reply #9 on: August 07, 2022, 11:29:18 am »
A valid point but in this case the file picker is triggered by a JS script

Code:
<html lang="en">
<head>
    <title>KDE485</title>
    <meta http-equiv=Content-Type content="text/html; charset=utf-8">
</head>
<body>
<h2>Upload file</h2>
<input type="file" id="file">
<input type="submit" value="Upload" id="submit">
<a href="/files.html"><button>Cancel</button></a>
<div id="progress">Please select a file to upload and then click Upload</div>
<script>
// Show upload progress as "loaded of total, NN.N%".
function updateProgress(evt)
{
    if (evt.lengthComputable)
    {
        document.getElementById("progress").innerHTML =
            evt.loaded + " of " + evt.total + ", " + (evt.loaded / evt.total * 100).toFixed(1) + "%";
    }
}

// Read the selected file into memory and send it to the server with an HTTP PUT.
function uploadFile()
{
    const fileInput = document.getElementById('file');

    if (fileInput.files.length == 0) {
        alert("Select a file to upload first");
        return;
    }

    document.getElementById("submit").disabled = true;

    const fileReader = new FileReader();
    fileReader.addEventListener("load", function (e) {
        const rawData = e.target.result;
        const putRequest = new XMLHttpRequest();
        // The filename is passed to the server as part of the request path.
        putRequest.open("PUT", "/ufile=" + fileInput.files[0].name);
        putRequest.upload.addEventListener("progress", updateProgress, false);
        putRequest.addEventListener("load", function (f) {
            if (putRequest.status == 200 || putRequest.status == 201) {
                document.getElementById("progress").innerHTML = '';
                alert("Upload succeeded");
                history.back();
            } else {
                alert("Upload failed with code " + putRequest.status);
                document.getElementById("progress").innerHTML = 'Upload failed.';
                document.getElementById("submit").disabled = false;
            }
        });
        putRequest.send(rawData);
    });
    fileReader.readAsArrayBuffer(fileInput.files[0]);
}

document.getElementById("submit").addEventListener("click", uploadFile);

</script>
</body>
</html>

So it should be the standard file picker for the OS in question, capable of selecting any file, and I have no control over anything. Actually the JS could be modified to do the 8.3 check.
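
A client-side 8.3 check could be as small as the sketch below. The pattern and function name are hypothetical, and the allowed-character set is a conservative assumption rather than a copy of what FatFS accepts, so the server-side check remains the one that counts.

```javascript
// Assumed rule: 1-8 name characters, then optionally a dot and 1-3 extension
// characters, all drawn from A-Z, 0-9 and !#$%&'()-@^_`{}~ (case-insensitive).
const SFN_PATTERN = /^[A-Z0-9!#$%&'()\-@^_`{}~]{1,8}(\.[A-Z0-9!#$%&'()\-@^_`{}~]{1,3})?$/i;

// Returns true if the name looks like a valid 8.3 filename under that rule.
function isValid83(name) {
  return SFN_PATTERN.test(name);
}
```

In `uploadFile()` this could run right after the empty-selection check: if `isValid83(fileInput.files[0].name)` is false, show an alert and return before the file is read.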

I haven't actually tested the above code yet :)

Of course, as with any client-side stuff, this is easily hacked, but there is only so much I want to do in what is supposed to be a controlled environment (this http server will be specified as never to be on an open port, etc, in case the customer is a complete idiot).

EDIT: I have found that passing an invalid-8.3 filename to FatFS does fail upon opening the file. Tracing the code shows they do the various obvious checks for size, invalid chars, etc. Pretty good!
« Last Edit: August 07, 2022, 12:46:23 pm by peter-h »
 

ledtester
« Reply #10 on: August 07, 2022, 03:21:52 pm »
Quote
A valid point but in this case the file picker is triggered by a JS script

What I mean is that a user (more specifically a malicious user) can send a file upload request without using the HTML pages you have set up for the file upload. It's the same story of how hackers break into home routers, internet connected cameras and other IoT devices.

 

peter-h (topic starter)
« Reply #11 on: August 07, 2022, 03:50:25 pm »
Sure, but other than running this thing on a big 80x86 machine, with Centos, NGINX or Apache, with regular patches, one has to assume that - by installation - it isn't going to be open to the chinese and russians :)

There is a username+pwd login anyway, plus an optional client IP requirement which you can set to e.g. the VPN terminator of your remote access facility. Again, the username and password could be overloaded (the 20-char limit is implemented at the client) and then it is a matter of how solid LWIP is at dealing with oversize packets. I would think it chucks them out; it's an obvious thing.

You get this debate with all embedded systems. There could be a vulnerability at the low level ETH part, where a dedicated DMA controller is transferring incoming packets to a chain of buffers. One should limit the DMA transfer count to the MTU but you never know...

There is no support on low level STM 32F4 ETH, same with LWIP, same with FatFS. Well, there are forums and mailing lists, full of posts from variously desperate people, and almost nobody answering. How big is ST? 12.7BN annual sales, no support forum presence, and there is just one guy who knows anything (quite a lot actually), who is not even working for ST, and who occasionally posts something, and only after telling you that you are a complete m0ron who cannot read. Welcome to the world of embedded "open source" software :) This is why "IOT" can't be on an open port (already widely debated).

I supervised a project (CentOS etc as above) which does data validation at the client (JS) and does it all again at the server. It is about 3x more work but you have to do it; it is a public site.

This is basically an industrial control product but with lots of applications outside. Probably the biggest attack vector is pure vandalism via an inside job, possibly aided by somebody leaking the WIFI pwd and then somebody getting in from a car parked outside. That vector would be the same even if this product was totally secure.
« Last Edit: August 07, 2022, 04:02:42 pm by peter-h »
 

ejeffrey
« Reply #12 on: August 07, 2022, 05:09:48 pm »
Do you have to use the directory to hold the filename?  If the only thing that is going to happen is downloading the files via the same HTTP server you could just use sequential filenames and store a separate mapping of the original filenames to the underlying sequential filenames.

That's a pretty lame thing to do because the whole point of a filesystem is to keep metadata in sync with file data but 8.3 is a pretty big restriction and if fatfs isn't tracking data you need then maybe you have to do it yourself.

Alternatively, still do your best to truncate names sensibly, so you have something sensible to fall back on if the long name database gets corrupted.
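
A sketch of what that separate mapping might look like, with everything in it hypothetical: the class name, the "F0000000.DAT" naming and the tab-separated index format are only for illustration.

```javascript
// Hypothetical index: original (long) name -> sequential 8.3 name on disk.
class NameIndex {
  constructor() {
    this.byOriginal = new Map();
    this.next = 0;
  }

  // Returns the on-disk name for an original name, allocating a new one the
  // first time that name is seen. Uploading the same long name again maps to
  // the same underlying file, so it can be replaced.
  storedNameFor(original) {
    let stored = this.byOriginal.get(original);
    if (stored === undefined) {
      stored = "F" + String(this.next++).padStart(7, "0") + ".DAT";
      this.byOriginal.set(original, stored);
    }
    return stored;
  }

  // One "stored<TAB>original" line per file, e.g. for writing an index file.
  serialize() {
    return [...this.byOriginal].map(([o, s]) => s + "\t" + o).join("\n");
  }
}
```

The index itself then becomes one more file that has to be kept consistent with the directory, which is the trade-off mentioned above.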
 

peter-h (topic starter)
« Reply #13 on: August 08, 2022, 09:07:48 am »
I actually don't need to store LFNs in the target. I just need to prevent them being uploaded to it, but FatFS rejects them anyway (if LFN is disabled) because it already contains the exact 8.3-checking code I was looking for (which isn't that complicated).