Comments
|
Emacs23
wrote on the 17 Oct 08 at 21:06
Of course, it would be wonderful if this were done. But I'm afraid it's impossible. They can't even get rid of the bugs in Tracker. So it's stupid to wait until they support a convenient query language.
|
|
|
|
It's not impossible. They just don't give the tracker the love it needs ;)
|
andruk
(Idea reviewer)
wrote on the 18 Oct 08 at 09:18
If Tracker isn't a viable option, then use Beagle, or something that works! This is FOSS, and if there is one thing I have learned about FOSS it is that there is never a single solution. Never ever ever.
However, one of the recurring problems in FOSS is that there are sometimes many projects of roughly equal mediocrity.
+1
|
|
_alex_
wrote on the 20 Oct 08 at 05:34
Emacs23, I know what you mean. The state of tracker is, quite frankly, abysmal. It's nigh useless because of bugs (I've omitted discussion of them, because my original idea was locked for that very reason), but also because of lack of proper integration with the Gnome desktop (the search tool has next to 0 options to narrow down results to the actually relevant ones, and most of tracker's supposed power is not exposed to the user anywhere else either).
Frustration is what prompted this series of ideas. Originally I planned to lampoon the many "WTF moments" of actually using Tracker search, but realized that my ideas would get locked if I did. So instead I try to be constructive... or rather wishful of what should be...
Anyway, for an even more outrageous idea, read Part 3! :)
|
|
Endolith
wrote on the 20 Oct 08 at 16:08
Don't base anything on Tracker. It doesn't work well and shouldn't be a dependency for this functionality to be used by other programs.
Build the metadata/tags into a low-level subsystem that any program can use on any desktop environment on any file system:
http://brainstorm.ubuntu.com/idea/9560/
'typing the following into Nautilus' search bar: "*.mp3 rating:5".'
No. That would only find mp3 files. What about music stored in oggs and wav files? What about mp3 files that contain system sounds or audiobooks instead of music? "music rating:5" would be better, with all musical files of any format in the "music" tag. But then things that are not music files but are related to music might be tagged "music", too? :) It gets complicated.
Look at "triple tags" for a possible implementation, though I think it is not very well-defined:
http://www.foo.be/cgi-bin/wiki.pl/MachineTag
|
|
_alex_
wrote on the 20 Oct 08 at 19:21
>"No. That would only find mp3 files. What about music stored in oggs and wav files?"
Ha! I knew someone would call me out on this by pointing to ogg :)
In fact, I did consider that not all audio is in mp3s. Though for simplicity I opted to give a basic example and let the reader use their imagination for how to improve on it. If you've read my previous idea, this is how I would do it: "filetype:music, rating:5". Which is similar to what you're suggesting.
>"But then things that are not music files but are related to music might be tagged "music", too? :) It gets complicated."
That's why there should be a distinction between user assignable tags, and metadata describing the filetype (e.g. "audio, music" for mp3s and oggs).
>"What about mp3 files that contain system sounds or audiobooks instead of music?"
In most cases you wouldn't index outside your home directory (or hidden config dirs). And even if you did, you could add a qualifier such as "location:/home", which would search only below home (recursively). It gets slightly more elaborate quickly if you start to consider some other use cases, so I just left it out in my discussion.
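To make the qualifier idea concrete, here is a minimal sketch of how a query such as "filetype:music, rating:5, location:/home" might be split into qualifiers. The `Query` class and `parse_query` function are hypothetical names invented for illustration; this is not how Tracker or Nautilus actually parse searches.

```python
from dataclasses import dataclass, field


@dataclass
class Query:
    """Parsed form of a search string such as 'filetype:music, rating:5'."""
    filters: dict = field(default_factory=dict)  # qualifier -> value
    terms: list = field(default_factory=list)    # bare words with no qualifier


def parse_query(text: str) -> Query:
    """Split a comma-separated query into 'key:value' filters and bare terms."""
    query = Query()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        # Split on the first colon only, so a value like '/home' stays whole.
        key, sep, value = part.partition(":")
        if sep and key and value:
            query.filters[key.strip().lower()] = value.strip()
        else:
            query.terms.append(part)
    return query


q = parse_query("filetype:music, rating:5, location:/home")
print(q.filters)
# {'filetype': 'music', 'rating': '5', 'location': '/home'}
```

Note that every value comes out as a string here; comparisons such as "rating:>3" would need a typed backend behind the parser.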
As for jettisoning Tracker (or at least avoiding it): I'm certainly not opposed to that, though then we need something better (Beagle is not perfect either).
Finally, a taggable file system is not sufficient. A file system built around a metadata db (such as dbfs) could certainly be the future of filesystems, though if you've been following the kernel mailing list, it's pretty evident that Linux is moving towards btrfs in the semi-near future anyway...
|
|
Endolith
wrote on the 20 Oct 08 at 21:58
"In most cases you wouldn't index outside your home directory"
Audiobooks would be in mp3 files inside your home directory with ID3 tags and everything else, but should not show up in a music player. That's why we need tags like "music" and "audiobook", and not just machine tags like "filetype:audio".
|
|
Endolith
wrote on the 20 Oct 08 at 22:45
"Finally, a taggable file system is not sufficient."
What is it missing?
The way I'm imagining it, what you call "metadata" would be accessible as "tags", in the same way as user-generated labels/tags. For instance, searching for "filetype:mp3" would show you all mp3s, while searching for "music" would show you all music files, but the "filetype:mp3" tag would be permanently, automatically attached to the file by the subsystem.
Higher level tags like "artist:Madonna" would be automatically created from the ID3 tags when the file is created or when the ID3 tags are changed, but could still be changed manually...
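A rough sketch of what I mean, assuming the embedded metadata (ID3 and so on) has already been read into a plain dict by some other tool. The function names and the extension table are made up for illustration:

```python
import os

# Hypothetical mapping from file extension to a broad media kind.
KIND_BY_EXTENSION = {".mp3": "audio", ".ogg": "audio", ".wav": "audio", ".jpg": "image"}


def automatic_tags(path, embedded=None):
    """Derive tags from the file name and from already-extracted embedded metadata."""
    ext = os.path.splitext(path)[1].lower()
    tags = set()
    if ext:
        tags.add("filetype:" + ext.lstrip("."))
    if ext in KIND_BY_EXTENSION:
        tags.add("kind:" + KIND_BY_EXTENSION[ext])
    for key, value in (embedded or {}).items():
        tags.add(f"{key}:{value}")
    return tags


def all_tags(path, embedded=None, user_tags=()):
    """Automatic tags and user-assigned tags are searchable in the same way."""
    return automatic_tags(path, embedded) | set(user_tags)


tags = all_tags("song.mp3", {"artist": "Madonna"}, user_tags={"music"})
```

Here `tags` contains "filetype:mp3", "kind:audio", "artist:Madonna" and the user tag "music", so an audiobook in an mp3 file would share the first two but not the last.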
"A file system built around a metadata db (such as dbfs) could certainly be the future of filesystems, though if you've been following the kernel mailing list, it's pretty evident that linux is moving towards btrfs in the semi-near future anyway..."
People use many different filesystems, though. I think any good solution needs to be built on top of the file system, so that people can still index things on their FAT USB drives, etc. But it should be below things like Tracker, so that it is not app-specific and can be used by all apps in any desktop. Probably some kind of hidden directories or hidden files or something...
|
|
_alex_
wrote on the 21 Oct 08 at 00:56
>Audiobooks would be in mp3 files inside your home directory with ID3 tags and everything else, but should not show up in a music player. That's why we need tags like "music" and "audiobook", and not just machine tags like "filetype:audio".
Fair enough, to make such a fine distinction you may need user tags, though audio books might be identifiable as such from the ID3 tags (and hence would also be in the metadata db).
>What is it missing?
>The way I'm imagining it, what you call "metadata" would be accessible as "tags", in the same way as user-generated labels/tags. For instance, searching for "filetype:mp3" would show you all mp3s, while searching for "music" would show you all music files, but the "filetype:mp3" tag would be permanently, automatically attached to the file by the subsystem.
>Higher level tags like "artist:Madonna" would be automatically created from the ID3 tags when the file is created or when the ID3 tags are changed, but could still be changed manually...
What's missing is an actual database. What you suggest is overloading tags to wedge all kinds of different data into them. This is *bad*! Why? Let's see: filetypes, artist... what else? Oh: date received, oh wait no that's a date... probably shouldn't be stored as a string... How about ratings? No, can't do math operations on strings efficiently. Example "rating:>3". So now you may say let's have different tag types! Well good thinking! That's what a relational database is, and more :)
Bottom line: If you cram all of this information into tags, your search will no longer be fast because of wasted space (do you really need to store the "filetype:" prefix for every filetype tuple? Answer: probably not :P) and collisions/unbalanced trees. So, as I said: What's missing is an actual database. Sure, you can tack one on top of tags, but it's not any different from what Tracker does (or tries to do) with a conventional filesystem.
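As a minimal sketch of the difference, assuming SQLite as a stand-in for whatever database a real indexer would use (the table layout and file paths are invented for illustration): ratings live in an integer column, user tags live in their own table, and "rating:>3" becomes an ordinary numeric comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE files (
    id       INTEGER PRIMARY KEY,
    path     TEXT NOT NULL UNIQUE,
    filetype TEXT NOT NULL,
    rating   INTEGER              -- a number, so 'rating > 3' is a real comparison
);
CREATE TABLE tags (
    file_id INTEGER NOT NULL REFERENCES files(id),
    tag     TEXT NOT NULL,        -- user-assignable tags only
    PRIMARY KEY (file_id, tag)
);
CREATE INDEX idx_files_rating ON files(rating);
""")

conn.executemany(
    "INSERT INTO files (path, filetype, rating) VALUES (?, ?, ?)",
    [
        ("/home/alex/a.mp3", "mp3", 5),
        ("/home/alex/b.ogg", "ogg", 4),
        ("/home/alex/c.mp3", "mp3", 2),
    ],
)
conn.executemany(
    "INSERT INTO tags (file_id, tag) SELECT id, ? FROM files WHERE path = ?",
    [
        ("music", "/home/alex/a.mp3"),
        ("music", "/home/alex/b.ogg"),
        ("audiobook", "/home/alex/c.mp3"),
    ],
)

# Equivalent of the search "music, rating:>3".
rows = conn.execute(
    """
    SELECT f.path FROM files AS f
    JOIN tags AS t ON t.file_id = f.id
    WHERE t.tag = 'music' AND f.rating > 3
    ORDER BY f.path
    """
).fetchall()
highly_rated_music = [row[0] for row in rows]
print(highly_rated_music)
# ['/home/alex/a.mp3', '/home/alex/b.ogg']
```

Tags are still there, but only as one table among several, and the "filetype:" prefix is never stored per row.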
>People use many different filesystems, though. I think any good solution needs to be built on top of the file system, so that people can still index things on their FAT USB drives, etc. But it should be below things like Tracker, so that it is not app-specific and can be used by all apps in any desktop. Probably some kind of hidden directories or hidden files or something...
Oh I agree with that 100%. It's you that suggested it as a possible solution here: http://brainstorm.ubuntu.com/idea/9560/
It should in fact just be a metadata database on top of any filesystem (with the filesystem providing the necessary hooks to track file management). At least that's what I think, and have described here. However, some *cough*WinFS*cough* people may disagree.
Tracker does try to be a metadata repository and search API. Though it does the job very poorly in its current state. As I said before, I wouldn't be opposed to using another set of tools, if they are better. However, so far I've not seen anybody point me to a working solution with the exception of Nepomuk. Which I dare say I may start contributing to, because it is *exactly* what I envision the future to look like :)
|
|
_alex_
wrote on the 21 Oct 08 at 01:07
>Oh I agree with that 100%. It's you that suggested it as a possible solution here: http://brainstorm.ubuntu.com/idea/9560/
I was referring to dbfs (not Tracker) in that sentence. I realized it wasn't clear as I only meant to quote the first two sentences of Endolith :)
|
|
_alex_
wrote on the 21 Oct 08 at 01:12
Also remember that tags would still be available with a central metadata database. That is, tags are a subset of the metadata that the DB will store, and allow searching on, but tags alone do not substitute well for the lack of a real *relational* database.
|
|
Endolith
wrote on the 21 Oct 08 at 14:42
"Fair enough, to make such a fine distinction you may need user tags, though audio books might be identifiable as such from the ID3 tags (and hence would also be in the metadata db)."
Similarly, you'd want to differentiate photos from diagrams even though both are stored in jpeg, podcasts from music even though both are stored in mp3, instructional videos from movies, even though both are stored in mpg. On the other hand, you might want both music videos and music audio files to show up in your media player as the default "music" playlist.
I really don't like the Microsoft-esque categorization system of dividing things up by filetype. "My Music", "My Pictures", "My Videos", "My E-books", and so on. It should be organized with meaningful tags like "Music", "Podcasts", "Photos", "Diagrams", "Movies", "Music videos", "Textbooks", "Family vacation 2008", etc.
"Oh: date received, oh wait no that's a date... probably shouldn't be stored as a string..."
Why not? ISO format is a standardized, computer readable way to represent dates as strings, and could be parsed by tools to be represented in whatever format you want.
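A quick sketch of what I mean, using Python's standard `datetime` module (the dates are made-up sample values): ISO dates sort correctly even as plain strings, and a tool can parse them whenever it needs real date arithmetic.

```python
from datetime import date

# Made-up "date received" values stored as ISO 8601 strings.
received = ["2008-10-21", "2008-10-17", "2008-10-20"]

# Plain string sorting already gives chronological order for this format.
sorted_as_text = sorted(received)
sorted_as_dates = sorted(received, key=date.fromisoformat)

# Parsing on demand allows arithmetic such as "how many days apart?".
days_apart = (date.fromisoformat("2008-10-21") - date.fromisoformat("2008-10-17")).days
```

Both sorts give the same order here, and `days_apart` comes out as 4. This only holds for the fixed-width YYYY-MM-DD form, not for arbitrary date strings.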
'do you really need to store "filetype:" prefix for every filetype tuple? Answer: probably not :P'
No, but you should be able to search for files as if they were stored that way. Like ID3 tags, photos also have embedded EXIF data that should be searchable, too, for instance. So some of the metadata would be stored inside the file itself, and some of it would be in an external database?
"So, as I said: What's missing is an actual database. Sure, you can tack one on top of tags, but it's not any different from what Tracker does (or tries to do) with a conventional filesystem."
I didn't say anything about not using a database as the backend. I think we are talking about the same thing with different words. I have seen a few proposals for using file links and directories and such to store the information (tagsistant), but I don't think that would be sufficient. But it needs to be something stored along with the files, not in a separate database that is unaffected when the files' "real" paths are moved, unmounted, etc.
"Tracker does try to be a metadata repository and search API. Though it does the job very poorly in its current state. As I said before, I wouldn't be opposed to using another set of tools, if they are better."
I think the database should be a dedicated tool and Tracker should use the information from it instead of implementing its own system.
"However, so far I've not seen anybody point me to a working solution with the exception of Nepomuk. Which I dare say I may start contributing to, because it is *exactly* what I envision the future to look like :)"
I will look into it.
"So now you may say let's have different tag types! Well good thinking! That's what a relational database is, and more :)"
"Also remember that tags would still be available with a central metadata database. That is, tags are a subset of the metadata that the DB will store, and allow searching on, but tags alone do not substitute well for the lack of a real *relational* database."
Yes. See http://weblog.scifihifi.com/2005/08/05/meta-tags-the-poor-mans-rdf/
|
|
_alex_
wrote on the 21 Oct 08 at 16:26
Now we're just talking semantics. We certainly agree that tags are quite a useful tool for organizing files. Our differences come because I just stop there, but you propose overloading tags as in the poor man's rdf article.
I just think that's a bad solution because it is limiting (e.g. math operations on tag ratings).
Also you have to keep in mind that you need to organize the metadata into an efficient data structure for fast searches. So you can't just keep some of it in the files. You need to index it, if you want to search on it quickly.
So at the end of the day you have no choice but to detach (or rather copy) the metadata from the files in order to perform fast searches.
The implementations so far (e.g. Tracker and Beagle) have been poor because they do a poor job of tracking file management, even though the filesystem does provide all the necessary hooks for tracking (e.g. file created, file renamed, and file moved events).
Also remember that my idea is not only about files, but also about email, Pidgin conversations, etc.
We both want more power to organize and search our data, I think we just differ on the implementation. So this is my last comment on my perspective :)
|
|
Endolith
wrote on the 21 Oct 08 at 18:56
"Our differences come because I just stop there, but you propose overloading tags as in the poor man's rdf article."
I'm not sure what the difference is between "overloading tags" and accessing metadata with searches.
"So you can't just keep some of it in the files. You need to index it, if you want to search on it quickly."
That's a good point. So the database should copy the metadata (ID3, EXIF) out of the files and index it for fast searches. But then it has to keep track of whether that data changes and keep it updated? And those values in the database should not be editable by the user, but user-created tags should be? To change the artist of an mp3 file, you should have to edit the file's ID3 tag itself and then the database will be updated automatically?
"So at the end of the day you have no choice but to detach (or rather copy) the metadata from the files in order to perform fast searches."
True. I'd really like the metadata to be carried around with the files, though, and not wholly separate.
I want to be able to carry files from home to work on a USB stick, and have them in "the same place" (can be found with the same queries or tags) as they are at home. I want to be able to send files to other people with the metadata already attached. Maybe that would necessitate sending as .tar.gz files with included helper files or something, in the same way that you would have to zip something if you wanted to preserve paths on another machine.
Maybe the act of mounting a filesystem would index all of its metadata into the local database?
"We both want more power to organize and search our data, I think we just differ on the implementation. So this is my last comment on my perspective :)"
I don't think we differ on implementation. I am trying to figure out what the implementation would be.
|
andruk
(Idea reviewer)
wrote on the 10 May 09 at 09:11
The filesystem can notify a program (like Beagle, Tracker, or Nepomuk) when the last modified date on a file changes. Then the database can rescan the file.
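A minimal sketch of the rescan half of that idea, assuming the indexer is simply told which path to check. Real change notification (for example inotify on Linux) is left out, and `MetadataIndex` and `size_only` are hypothetical names rather than part of any of those programs:

```python
import os
import tempfile


class MetadataIndex:
    """Caches extracted metadata and rescans a file only when its mtime changes."""

    def __init__(self, extract):
        self._extract = extract  # callable: path -> metadata dict
        self._entries = {}       # path -> (mtime_ns, metadata)

    def refresh(self, path):
        """Rescan path if its modification time changed; return True if rescanned."""
        mtime = os.stat(path).st_mtime_ns
        cached = self._entries.get(path)
        if cached is not None and cached[0] == mtime:
            return False
        self._entries[path] = (mtime, self._extract(path))
        return True

    def metadata(self, path):
        return self._entries[path][1]


def size_only(path):
    """Stand-in extractor; a real one would read ID3 or EXIF data."""
    return {"size": os.path.getsize(path)}


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "song.mp3")
    with open(path, "wb") as f:
        f.write(b"abc")

    index = MetadataIndex(size_only)
    first = index.refresh(path)   # new file, so it is scanned
    second = index.refresh(path)  # unchanged, so the cached copy is kept

    with open(path, "ab") as f:
        f.write(b"def")
    # Push the mtime forward explicitly so the change is visible even on
    # filesystems with coarse timestamp resolution.
    st = os.stat(path)
    os.utime(path, ns=(st.st_atime_ns, st.st_mtime_ns + 1_000_000_000))

    third = index.refresh(path)   # mtime changed, so the file is rescanned
    size = index.metadata(path)["size"]
```

In this sketch `first` and `third` are True while `second` is False, so the extractor only runs when the file has actually been touched.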
|