The standard code (this is mine, so not as smart as people here would expect) gets filenames in the order they appear in the directory/FAT.
/*
* Scans the directory and fills the array flist with the names, sizes and attribs, up to maxcount entries (each entry takes 17 bytes)
* Returns the # of files actually found (fcount).
* The array is not initialised, so you have to use the returned count to determine how many filenames were loaded.
* Any filename shorter than 12 bytes (the 12 includes the dot) has a 0x00 appended.
* Directories are found also.
* Each entry is 17 bytes:
* 0..11 filename
* 12..15 filesize (uint32_t, LSB in byte 12)
* 16 attrib byte:
*
* #define AM_RDO 0x01 Read only
* #define AM_HID 0x02 Hidden
* #define AM_SYS 0x04 System
* #define AM_DIR 0x10 Directory
* #define AM_ARC 0x20 Archive
*
*/
uint32_t get_file_list( char *flist, uint32_t maxcount )
{
    FRESULT fr;            /* Return value */
    DIR dj;                /* Directory object */
    FILINFO fno;           /* File information */
    uint32_t fcount = 0;   // # of files found so far
    uint32_t idx = 0;      // index into flist

    fr = f_findfirst(&dj, &fno, "", "*");            // Start to search for any file
    while (fr == FR_OK && fno.fname[0] && maxcount)  // Repeat while an item is found
    {
        // debug_thread_printf("%.12s", fno.fname);
        memcpy(&flist[idx], &fno.fname, 12);
        flist[idx+12] = (0x000000ff & fno.fsize);
        flist[idx+13] = (0x0000ff00 & fno.fsize) >> 8;
        flist[idx+14] = (0x00ff0000 & fno.fsize) >> 16;
        flist[idx+15] = (0xff000000 & fno.fsize) >> 24;
        flist[idx+16] = fno.fattrib;
        idx += 17;
        fcount++;
        maxcount--;
        fr = f_findnext(&dj, &fno);                  // Search for next item
    }
    f_closedir(&dj);
    return fcount;
}
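For reference, the 17-byte records can be unpacked on the consuming side like this; this is only an illustrative sketch (print_root_listing() and the 64-entry buffer size are made up, not part of the code above):
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include "ff.h"    /* for AM_DIR */

/* Sketch only: call get_file_list() above and decode each 17-byte record. */
void print_root_listing(void)
{
    static char flist[64 * 17];            /* room for 64 entries (made-up limit) */
    uint32_t n = get_file_list(flist, 64);

    for (uint32_t i = 0; i < n; i++) {
        const char *e = &flist[i * 17];

        /* Name: up to 12 chars, NUL-terminated only if shorter than 12. */
        char name[13];
        memcpy(name, e, 12);
        name[12] = '\0';

        /* Size: uint32_t, LSB first (bytes 12..15). */
        uint32_t size = (uint32_t)(uint8_t)e[12]
                      | ((uint32_t)(uint8_t)e[13] << 8)
                      | ((uint32_t)(uint8_t)e[14] << 16)
                      | ((uint32_t)(uint8_t)e[15] << 24);

        uint8_t attr = (uint8_t)e[16];
        printf("%-12s %10lu %s\n", name, (unsigned long)size,
               (attr & AM_DIR) ? "<DIR>" : "");
    }
}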
The disk is 2MB, which means FAT12, and I think the max # of files in the root directory (no subdirs are supported in this case) is 512. Sorting these conventionally would need a lot of RAM (512 entries at 17 bytes each is over 8 kbytes just to hold the list). The names are limited to 8.3 (the LFN build option bloats the code quite a lot).
I was thinking of a rather crude method, like this:
Initialise a uint64_t smallest_name = 0xffffffffffffffff
Do one directory scan. Load each filename, left-justified and padded with trailing spaces, into a uint64_t in big-endian byte order, so you get a number which is a) unique for the filename and b) such that comparing two of these numbers gives an alphabetical-equivalent comparison. Determine the "smallest" name in the list, set smallest_name to that, and display that filename / send it to the client / whatever.
Now scan the directory again, looking for the smallest name that is greater than the last one. When found, display that one.
Repeat until no more files.
So if you have 500 files you scan the dir 500 times, but with FatFS running on a FLASH (or RAM) disk this is very fast - around 500kbytes/sec on a 168MHz arm32.
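For illustration, here is a minimal sketch of that repeated-scan selection, assuming non-LFN 8.3 names; it is untested, uses f_opendir()/f_readdir() with plain strcmp() on the name instead of the packed uint64_t, and emit_name() is just a placeholder for "display / send to the client":
#include <string.h>
#include "ff.h"

/* Sketch: emit all root-directory names in alphabetical order by repeatedly
   scanning the directory and picking the smallest name that is strictly
   greater than the previously emitted one. Assumes non-LFN 8.3 names. */
FRESULT list_sorted(void (*emit_name)(const char *name))
{
    char last[13] = "";                 /* previously emitted name, "" = none yet */

    for (;;) {
        DIR dj;
        FILINFO fno;
        char best[13] = "";             /* smallest name > last found in this pass */
        FRESULT fr = f_opendir(&dj, "/");
        if (fr != FR_OK)
            return fr;

        while ((fr = f_readdir(&dj, &fno)) == FR_OK && fno.fname[0]) {
            if (strcmp(fno.fname, last) <= 0)
                continue;                           /* already emitted */
            if (!best[0] || strcmp(fno.fname, best) < 0)
                strcpy(best, fno.fname);            /* new best candidate */
        }
        f_closedir(&dj);

        if (fr != FR_OK)
            return fr;
        if (!best[0])
            return FR_OK;               /* nothing greater than last: all done */

        emit_name(best);
        strcpy(last, best);
    }
}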
Surprisingly, a search did not dig up anything like this (apart from the usual threads of people asking and getting no replies), so I wonder if there are slicker ways. This is a simple file listing function running in an HTTP server, to deliver a file listing to a client browser.
Here is what I was suggesting above – but beware, this hasn't even been compile-tested, so it likely has typos and such.
/* Per-file descriptor for directory scanning */
typedef struct {
    uint32_t size;    /* File size in bytes */
    uint8_t attr;     /* File attributes */
    char name[13];    /* File name as a C string, NUL-terminated, includes dot */
} dirscan_entry;

/* Directory scanning structure */
typedef struct {
    int (*filterfunc)(const FILINFO *);
    int (*sortfunc)(const dirscan_entry *, const dirscan_entry *);
    DIR dirh;
    dirscan_entry last;
} dirscan;

/* Fill dirscan_entry from the FILINFO structure.
   Return FR_NO_FILE (if not a valid entry), or FR_OK (if a valid entry). */
static FRESULT dirscan_entry_fill(dirscan_entry *ent, FILINFO *info)
{
    if (info->fname[0] == 0)
        return FR_NO_FILE;
    ent->size = info->fsize;
    ent->attr = info->fattrib;
    memset(ent->name, 0, sizeof ent->name);
    if (sizeof info->fname >= sizeof ent->name)
        memcpy(ent->name, info->fname, (sizeof ent->name) - 1);
    else
        memcpy(ent->name, info->fname, sizeof info->fname);
    return FR_OK;
}
int dirscan_namesort(const dirscan_entry *ent1, const dirscan_entry *ent2)
{
    const char *name1 = ent1->name;
    const char *name2 = ent2->name;

    while (1) {
        unsigned char c1 = *(name1++);
        unsigned char c2 = *(name2++);
        /* Case folding? Not necessary on FATFS, but let's do it anyway. */
        if (c1 >= 'a' && c1 <= 'z')
            c1 = c1 - 'a' + 'A';
        if (c2 >= 'a' && c2 <= 'z')
            c2 = c2 - 'a' + 'A';
        if (c1 != c2)
            return (int)c1 - (int)c2; /* == c1 - c2, due to C integer promotions */
        else if (!c1)
            return 0;
    }
}
/* Begin directory scanning. Returns FR_OK if valid, an error result otherwise.
   filterfunc() is optional; it returns zero if a file is to be skipped, nonzero to list.
   sortfunc(a, b) compares two dirscan_entries, and returns negative if a < b,
   zero if a == b, and positive if a > b.
*/
FRESULT dirscan_begin(dirscan *ds, const char *path,
                      int (*filterfunc)(const FILINFO *),
                      int (*sortfunc)(const dirscan_entry *, const dirscan_entry *))
{
    /* Remember the filter function. It is optional. */
    ds->filterfunc = filterfunc;

    /* Remember the sort function. NULL defaults to dirscan_namesort. */
    if (!sortfunc)
        ds->sortfunc = dirscan_namesort;
    else
        ds->sortfunc = sortfunc;

    /* Clear the last entry to all zeros. */
    ds->last.size = 0;
    ds->last.attr = 0;
    memset(ds->last.name, 0, sizeof ds->last.name);

    /* Open the specified path. */
    return f_opendir(&(ds->dirh), path);
}

FRESULT dirscan_end(dirscan *ds)
{
    return f_closedir(&(ds->dirh));
}
/* Repeatedly call this function with a dirscan handle
   and an array of dirscan_entry ent[ents_max],
   until the function returns zero (complete) or negative (error).
   The return value is the number of entries, in sorted order,
   filled into the ent array.
*/
int dirscan_get(dirscan *ds, dirscan_entry *ent, int ents_max)
{
    FILINFO info;
    dirscan_entry tempent;
    FRESULT result;
    int ents = 0;

    if (ents_max < 1)
        return -1; /* TODO: Better error reporting? */

    f_rewinddir(&(ds->dirh));

    while (1) {
        result = f_readdir(&(ds->dirh), &info);
        if (result != FR_OK)
            break;
        /* End of directory: f_readdir() reports an empty name. */
        if (!info.fname[0])
            break;
        /* If there is a filter function, apply. */
        if (ds->filterfunc && !ds->filterfunc(&info))
            continue; /* Filter returned zero, so we skip this entry. */
        result = dirscan_entry_fill(&tempent, &info);
        if (result != FR_OK)
            break;
        /* If we have already reported this entry (it does not sort strictly
           after the last entry of the previous call), skip it. */
        if (ds->last.name[0] && ds->sortfunc(&(ds->last), &tempent) >= 0)
            continue;
        /* If we have no entries yet, this will be the first one. */
        if (!ents) {
            ent[0] = tempent;
            ents++;
            continue;
        }
        /* Does this sort after the current last entry in the array? */
        if (ds->sortfunc(ent + ents - 1, &tempent) <= 0) {
            /* Append if the array is not full yet; otherwise drop it. */
            if (ents < ents_max)
                ent[ents++] = tempent;
            continue;
        }
        int enti = 0; /* Entry insertion index */
        /* If the new entry does not become the new first entry in the array, .. */
        if (ds->sortfunc(ent, &tempent) <= 0) {
            /* .. do a binary search. We know it goes after [0] but before [ents-1]. */
            int entj = ents - 1;
            while (entj > enti) {
                const int i = (enti + entj) / 2;
                if (ds->sortfunc(&tempent, ent + i) <= 0)
                    entj = i;
                else
                    enti = i + 1;
            }
        }
        /* If the array is full, drop the last entry. Otherwise, keep all. */
        if (ents >= ents_max) {
            memmove(ent + enti + 1, ent + enti, (ents_max - enti - 1) * sizeof ent[0]);
            ent[enti] = tempent;
        } else {
            memmove(ent + enti + 1, ent + enti, (ents - enti) * sizeof ent[0]);
            ent[enti] = tempent;
            ents++;
        }
    }

    if (result != FR_OK && result != FR_NO_FILE)
        return -1; /* TODO: Better error reporting to caller? */

    /* If we got any entries, save the last one. */
    if (ents > 0)
        ds->last = ent[ents - 1];

    return ents;
}
The idea is that you define a sort function (or use the above dirscan_namesort(), which is used by default if you supply a NULL sort function to dirscan_begin()), and optionally a filter function (for example, to not list hidden files, directories, or the . and .. entries).
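For example, a filter that hides hidden/system entries and dot entries might look like this (just a sketch; the name filter_visible is made up):
/* Example filter (sketch): return nonzero to list the entry, zero to skip it. */
static int filter_visible(const FILINFO *info)
{
    if (info->fattrib & (AM_HID | AM_SYS))  /* skip hidden and system entries */
        return 0;
    if (info->fname[0] == '.')              /* skip "." and ".." (and dot files) */
        return 0;
    return 1;
}
Such a function would be passed as the filterfunc_or_NULL argument in the usage example below.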
To list the contents of say the root directory, you use
dirscan handle;

if (dirscan_begin(&handle, "/", filterfunc_or_NULL, dirscan_namesort) != FR_OK) {
    /* Failed, probably a path error! */
} else {
    dirscan_entry ent[15];
    int ents;
    while ((ents = dirscan_get(&handle, ent, sizeof ent / sizeof ent[0])) > 0) {
        /* output ent[0 .. ents-1], inclusive */
    }
    dirscan_end(&handle);
}
Note that this works with FF_USE_FIND=0 (and FF_FS_MINIMIZE=0 or 1).
The smaller the ent[] array is, the more times one needs to read through the directory contents (with 512 root entries and ent[15], that is about ceil(512/15) = 35 passes instead of 512).
uint32_t get_file_list( char *flist, uint32_t maxcount )
...
memcpy(&flist[idx],&fno.fname,12);
...
So if you have 500 files you scan the dir 500 times, but with FatFS running on a FLASH (or RAM) disk this is very fast...
Why? You already saved everything in flist, so why the need to rescan 500x? Just run through flist!
Another thing you can do is an indirect sort. Instead of sorting the data in memory, sort an array of index values. Instead of 512*11 bytes you would only need 512*2. With a bit of cleverness, using a combination of merge sort and quick sort, you should be able to do that in 512 bytes. The downside of this is that you have a lot more random IO. Also, 512 bytes is a lot better than kBs, but that might still be too much.
Since flist entries are a fixed 17 bytes, he only needs an extra 17 bytes (chars) as temporary space for swapping flist entries.
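To make the indirect-sort idea concrete, here is a rough sketch over the 17-byte flist records from the first post (untested; it uses a simple insertion sort on a uint16_t index array rather than the merge/quick sort combination mentioned above):
#include <stdint.h>
#include <string.h>

/* Sketch: sort an index array instead of moving the 17-byte records.
   order[] must have room for fcount uint16_t values (512 entries = 1024 bytes). */
void sort_file_indices(const char *flist, uint16_t *order, uint32_t fcount)
{
    for (uint32_t i = 0; i < fcount; i++)
        order[i] = (uint16_t)i;

    for (uint32_t i = 1; i < fcount; i++) {
        uint16_t idx = order[i];
        const char *name = &flist[idx * 17u];
        uint32_t j = i;
        /* strncmp stops at the terminating 0x00, so any garbage after a short
           name does not matter; FAT short names are stored in upper case, so a
           plain byte comparison gives an alphabetical-equivalent order. */
        while (j > 0 && strncmp(&flist[order[j - 1] * 17u], name, 12) > 0) {
            order[j] = order[j - 1];
            j--;
        }
        order[j] = idx;
    }
}
Even in the worst case that is roughly 512*512/2 = 131,000 name comparisons, which is fast when everything is already in RAM.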
I think a reasonable solution to this, which can also address other stuff in the future, is to simply push out a block of data to the client browser at the start of each web page. This data will be from a file in the filesystem. So one can send out some javascript to do the sorting. Then one can do all kinds of clever stuff like sorting on different fields. And the JS will be trivial to change just by editing that file.
Now you've got the logic sorted out. Let's leave the content of this thread for other readers to get ideas from ;) Cheers.
No. It is because peter-h's proposed scheme uses f_findfirst() and f_findnext() to scan once through the whole file list for each single entry it outputs (the next one in sort order).
But he already saved/duplicated every search result he needs in flist???
Only up to maxcount. peter-h did not implement a way to obtain more than the first maxcount entries of a directory.
The indirect pointer method suggested by mariush is the way.
No, it will not work reliably, because FAT does not have inode numbers. Consider what happens if the directory is modified between indirect index generation and its use: the wrong file will be referenced. Horrible; I'd never accept software that does that sort of a thing.
What does my solution do, then? It may show stale data (entries that have since been deleted or renamed), but it never confuses them, since each dirscan_get() does a full pass over the directory contents.
But if I read correctly, it's the sorting problem (when he already has flist).
The way I read the question is: "I have very limited memory, not enough to store the names and sizes and attributes of all directory entries in a directory. How can I provide a list of the directory entries to a remote browser client, in sorted order?"
His example code (not a solution, but a reference) uses f_findfirst()/f_findnext() to read as many of the entries as fit in the given buffer. It neither sorts nor supports more directory entries than can fit in memory.
My suggestion describes code that acts like f_findnext(), but provides the next ents_max entries of the directory, optionally filtered, with the entire directory sorted using an arbitrary function, doing only one pass over the directory per call. It has a somewhat similar interface to the POSIX.1 scandir() function (which is in all ways superior to the opendir()/readdir()/closedir() that idiots continuously suggest to new programmers even on POSIX-compatible OSes), but needs to be called repeatedly until all entries have been reported. That is, it solves the excess directory scanning problem (f_readdir() is used instead of f_findfirst()/f_findnext(), so only number_of_entries/ents_max passes over the directory are needed), the sort problem (by letting the caller specify the sort function in the dirscan_begin() call), the unstated filtering problem (you don't normally want to list all files; you may wish to e.g. omit hidden or non-readable files in such file lists), and the memory limitation (it works even when you supply a buffer of just one entry, although that's a bit silly).
Since then, peter-h has pivoted to a different approach, somewhat dropping the original question. The new approach is to emit the directory listing as a JavaScript-enhanced HTML page, and do any sorting on the client side. This means that the response to an HTTP directory listing request boils down to something like the following:
FRESULT http_dirlist_response(connection_t *client, const char *path, int (*filter)(FILINFO *))
{
    DIR dh;
    FILINFO fi;
    FRESULT rc;

    /* Open the specified path, to make sure it exists. */
    if (!path || !*path)
        rc = f_opendir(&dh, "/");
    else
        rc = f_opendir(&dh, path);
    if (rc != FR_OK)
        return rc;

    /* Send the header part of the page.
       We assume this ends in the middle of a JavaScript assignment, say
           var fileList = [
    */
    rc = http_dirlist_header(client, path);
    if (rc != FR_OK) {
        f_closedir(&dh);
        return rc;
    }

    /* List each directory entry as a JavaScript object */
    while (1) {
        rc = f_readdir(&dh, &fi);
        if (rc != FR_OK || !fi.fname[0])
            break;
        if (filter && !filter(&fi))
            continue;
        rc = http_dirlist_entry(client, &fi);
        if (rc != FR_OK)
            break;
    }

    /* If successful thus far, emit the footer part of the page, starting with
           ];
       to end the JavaScript directory entry object array. */
    if (rc == FR_OK)
        rc = http_dirlist_footer(client, path);
    if (rc != FR_OK) {
        f_closedir(&dh);
        return rc;
    }

    return f_closedir(&dh);
}
where connection_t * is just a handle to the client; it could be anything, really.
The above refers to three other functions:
- http_dirlist_header(connection_t *client, const char *path) - emitting the fixed initial part of the HTML directory listing to the user
- http_dirlist_entry(connection_t *client, FILINFO *entry) - emitting a single entry in the directory listing to the user
- http_dirlist_footer(connection_t *client, const char *path) - emitting the fixed final part of the HTML directory listing to the user
If specified, the filter(FILINFO *entry) function can be used to suppress listing of specific files, such as hidden files, and files used internally by this implementation (for example, in the root directory, describing the header and the footer of the HTML listing page). The entries themselves are not sorted, as that is done by the client-side Javascript.
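As an illustration of the middle one, http_dirlist_entry() could format each entry as one element of the JavaScript array. This is only a sketch: connection_send() and the field names are made up here, and connection_t is the same opaque client handle as above:
#include <stdio.h>
#include "ff.h"

/* Sketch: format one FILINFO as a JavaScript object literal, e.g.
     {n:"README.TXT",s:1234,d:0,fd:22345,ft:43210},
   and hand it to a hypothetical connection_send(client, buf, len).
   8.3 short names cannot contain quote characters, so no escaping is needed. */
FRESULT http_dirlist_entry(connection_t *client, FILINFO *entry)
{
    char line[96];
    int len = snprintf(line, sizeof line,
                       "{n:\"%s\",s:%lu,d:%u,fd:%u,ft:%u},\n",
                       entry->fname,
                       (unsigned long)entry->fsize,
                       (entry->fattrib & AM_DIR) ? 1u : 0u,
                       (unsigned)entry->fdate,
                       (unsigned)entry->ftime);
    if (len < 0 || (size_t)len >= sizeof line)
        return FR_INVALID_NAME;    /* did not fit; pick a suitable error code */
    return (connection_send(client, line, (size_t)len) == 0) ? FR_OK : FR_DISK_ERR;
}
Passing the raw FAT fdate/ftime words leaves the date/time decoding to the client-side JavaScript.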
If the footer part contains say
<table>
  <thead>
    <tr>
      <th><a class="th" onclick="return sortType();">Type</a></th>
      <th><a class="th" onclick="return sortName();">Filename</a></th>
      <th><a class="th" onclick="return sortTime();">Modified</a></th>
      <th><a class="th" onclick="return sortSize();">Size</a></th>
    </tr>
  </thead>
  <tbody id="listparent"></tbody>
</table>
then a simple JavaScript for loop can construct the dynamic HTML (<tr><td>Type</td><td>Name</td><td>Date Time</td><td>Size</td></tr>) for each directory entry in sorted order, combine them into one long string, and (re)display the list using
document.getElementById("listparent").innerHTML = stringCombiningTheTableRows;
If the header part ends with the JavaScript <script type="text/javascript">var fileList = [, and the footer part starts with ];, then defines the four sort functions (which return false), and finally adds window.addEventListener("load", sortName);, then when the page loads the file list is initially shown sorted by sortName().
As you can see from the above outline, this is a reasonable solution, pushing the memory-hungry sorting to the client, with minimal server-side processing. Sure, there are additional functions, but they shouldn't be too large, and one can also drop f_findfirst()/f_findnext() support (setting FF_USE_FIND=0).
In a properly designed implementation, however, we'd write the response handler as a coroutine, so that the TCP stack, HTTP server, and/or RTOS used can call it to generate around a packet's worth of more data to send to the client, instead of the single-threaded one-client-at-a-time approach shown above. But that gets into some annoyingly complicated stuff (as in, depending on everything else that is being used) that is better discussed in a separate thread.
In case it is not obvious, my intent with these posts is not to just reply to those who replied to me, but hopefully to describe the solutions and approaches in a way that is understandable to even those who much later on stumble on this thread, looking for answers to a similar question. So, if I happen to explain something that you very well know, do not be offended: I'm not assuming you don't understand this, I'm just trying to keep things at a level that anyone interested can follow too. Answering to just one person with this kind of verbosity (which I cannot help but do) isn't much fun; but writing it in a way that it might help others too, makes it worthwhile and fun for me.