Ubuntu QA:
The Ubuntu community has contributed 21986 ideas, 135057 comments, 2615221 votes
Idea #14506: Ubuntu Search Part 2 - Tracker Metadata in Nautilus

Written by _alex_ the 17 Oct 08 at 20:16. Related project: Nautilus. Status: New
Rationale
PLEASE TAKE THE TIME TO READ FULLY BEFORE VOTING

This is part 2 of an epic series of ideas on how to make search in Ubuntu not suck. For part 1 click here. In this idea we look at how to integrate Tracker metadata into Nautilus (the file browser).

When a file in Nautilus is selected, the status bar should expand to show information about the file, including editable Tracker metadata. The metadata could be context sensitive, and include tags, as well as ratings, artist, and genre information for music, author and title for PDFs, etc.

Some of this metadata should be directly editable in the status bar. (I say *some* because it makes sense for ratings, but not song play counts etc).

As discussed in the previous idea, exposing the metadata to the user in an intuitive manner will greatly enhance usability. For example: without the use of any music player, one could generate a playlist of top rated songs simply by typing the following into Nautilus' search bar: "*.mp3 rating:5". That is, list all songs with a rating of 5 stars. And remember that songs can be rated directly in Nautilus!
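
To make the example query concrete, here is a minimal sketch of how a string such as "*.mp3 rating:5" could be split into a filename pattern and key:value filters. This is only an illustration: `parse_query`, `search`, and the sample records are hypothetical names, and nothing here is the actual Nautilus or Tracker query syntax.

```python
from fnmatch import fnmatch


def parse_query(query):
    """Split a query into filename globs and key:value filters."""
    globs, filters = [], {}
    for token in query.split():
        key, sep, value = token.partition(":")
        if sep and key and value:
            filters[key] = value      # e.g. "rating:5" -> {"rating": "5"}
        else:
            globs.append(token)       # e.g. "*.mp3"
    return globs, filters


def search(files, query):
    """Return names of records matching every glob and every filter."""
    globs, filters = parse_query(query)
    hits = []
    for record in files:
        if not all(fnmatch(record["name"], g) for g in globs):
            continue
        if all(str(record.get(k)) == v for k, v in filters.items()):
            hits.append(record["name"])
    return hits


# Hypothetical in-memory stand-in for the metadata store.
FILES = [
    {"name": "anthem.mp3", "rating": 5},
    {"name": "ballad.mp3", "rating": 3},
    {"name": "chorus.ogg", "rating": 5},
]

print(search(FILES, "*.mp3 rating:5"))
```

Only the record that satisfies both the glob and the rating filter is returned; the ogg file is skipped even though it has the same rating.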

If the idea of smart search folders is incorporated with this, one could simply set up virtual search folders for top 10 played songs, recently accessed files of filetype:pdf, etc. Just spend some time thinking about various scenarios where this would be useful to you. Imagine the organizational power that you get at your fingertips! And how intuitive this would be!

In the next idea in this series we'll look at further integrating Tracker with applications. Stay tuned :)

Read and vote on Part 1 and Part 3 here!

37
votes
Solution #1: Edit in the Status bar
Written by _alex_ the 17 Oct 08 at 20:16.
When a file in Nautilus is selected, the status bar should expand to show information about the file, including editable Tracker metadata. The metadata could be context sensitive, and include tags, as well as ratings, artist, and genre information for music, author and title for PDFs, etc.

Some of this metadata should be directly editable in the status bar. (I say *some* because it makes sense for ratings, but not song play counts etc).

As discussed in the previous idea, exposing the metadata to the user in an intuitive manner will greatly enhance usability. For example: without the use of any music player, one could generate a playlist of top rated songs simply by typing the following into Nautilus' search bar: "*.mp3 rating:5". That is, list all songs with a rating of 5 stars. And remember that songs can be rated directly in Nautilus!

If the idea of smart search folders is incorporated with this, one could simply set up virtual search folders for top 10 played songs, recently accessed files of filetype:pdf, etc. Just spend some time thinking about various scenarios where this would be useful to you. Imagine the organizational power that you get at your fingertips! And how intuitive this would be!

Comments
Emacs23 wrote on the 17 Oct 08 at 21:06
Of course, it would be wonderful if this were done. But I'm afraid it's impossible. They can't even get rid of the bugs in Tracker. So it's stupid to wait until they support a convenient query language.

ajjeckmans wrote on the 18 Oct 08 at 08:36
It's not impossible. They just don't give Tracker the love it needs ;)

andruk (Idea reviewer) wrote on the 18 Oct 08 at 09:18
If Tracker isn't a viable option, then use Beagle, or something that works! This is FOSS, and if there is one thing I have learned about FOSS it is that there is never a single solution. Never ever ever.

However, one of the recurring problems in FOSS is that there are sometimes many projects of roughly equal mediocrity.

+1

_alex_ wrote on the 20 Oct 08 at 05:34
Emacs23, I know what you mean. The state of Tracker is, quite frankly, abysmal. It's nigh useless because of bugs (I've omitted discussion of them, because my original idea was locked for that very reason), but also because of its lack of proper integration with the GNOME desktop (the search tool has next to no options for narrowing results down to the actually relevant ones, and most of Tracker's supposed power is not exposed to the user anywhere else either).

Frustration is what prompted this series of ideas. Originally I planned to lampoon the many "WTF moments" of actually using Tracker search, but realized that my ideas would get locked if I did. So instead I try to be constructive... or rather wishful of what should be...

Anyway, for an even more outrageous idea, read Part 3! :)

Endolith wrote on the 20 Oct 08 at 16:08
Don't base anything on Tracker. It doesn't work well and shouldn't be a dependency for this functionality to be used by other programs.

Build the metadata/tags into a low-level subsystem that any program can use on any desktop environment on any file system:

http://brainstorm.ubuntu.com/idea/9560/

'typing the following into Nautilus' search bar: "*.mp3 rating:5".'

No. That would only find mp3 files. What about music stored in oggs and wav files? What about mp3 files that contain system sounds or audiobooks instead of music? "music rating:5" would be better, with all musical files of any format in the "music" tag. But then things that are not music files but are related to music might be tagged "music", too? :) It gets complicated.

Look at "triple tags" for a possible implementation, though I think it is not very well-defined:

http://www.foo.be/cgi-bin/wiki.pl/MachineTag

_alex_ wrote on the 20 Oct 08 at 19:21
>"No. That would only find mp3 files. What about music stored in oggs and wav files?"

Ha! I knew someone would call me out on this by pointing to ogg :)
In fact, I did consider that not all audio is in mp3s. Though for simplicity I opted to give a basic example and let the reader use their imagination for how to improve on it. If you've read my previous idea, this is how I would do it: "filetype:music, rating:5". Which is similar to what you're suggesting.

>"But then things that are not music files but are related to music might be tagged "music", too? :) It gets complicated."
That's why there should be a distinction between user assignable tags, and metadata describing the filetype (e.g. "audio, music" for mp3s and oggs).

>"What about mp3 files that contain system sounds or audiobooks instead of music?"

In most cases you wouldn't index outside your home directory (or hidden config dirs). And even if you did, you could add a qualifier such as "location:/home", which would search only below home (recursively). It gets slightly more elaborate quickly if you start to consider some other use cases, so I just left it out in my discussion.
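
A qualifier like "location:/home" boils down to a recursive "is this path below that directory" check. The sketch below shows one way to do that without being fooled by look-alike prefixes; `is_below` is a hypothetical helper name, and it assumes POSIX-style absolute paths.

```python
from pathlib import PurePosixPath


def is_below(location, path):
    """True if path is the location itself or lies anywhere beneath it."""
    root = PurePosixPath(location)
    candidate = PurePosixPath(path)
    return candidate == root or root in candidate.parents


paths = [
    "/home/alex/music/anthem.mp3",
    "/usr/share/sounds/beep.mp3",
    "/homework/notes.mp3",          # shares a string prefix, but is not under /home
]
print([p for p in paths if is_below("/home", p)])
```

Comparing path components rather than raw string prefixes is what keeps "/homework" out of a search restricted to "/home".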

As for jettisoning Tracker (or at least avoiding it): I'm certainly not opposed to that, though then we need something better (Beagle is not perfect either).

Finally, a taggable file system is not sufficient. A file system built around a metadata db (such as dbfs) could certainly be the future of filesystems, though if you've been following the kernel mailing list, it's pretty evident that Linux is moving towards btrfs in the semi-near future anyway...

Endolith wrote on the 20 Oct 08 at 21:58
"In most cases you wouldn't index outside your home directory"

Audiobooks would be in mp3 files inside your home directory with ID3 tags and everything else, but should not show up in a music player. That's why we need tags like "music" and "audiobook", and not just machine tags like "filetype:audio".

Endolith wrote on the 20 Oct 08 at 22:45
"Finally, a taggable file system is not sufficient."

What is it missing?

The way I'm imagining it, what you call "metadata" would be accessible as "tags", in the same way as user-generated labels/tags. For instance, searching for "filetype:mp3" would show you all mp3s, while searching for "music" would show you all music files, but the "filetype:mp3" tag would be permanently, automatically attached to the file by the subsystem.

Higher level tags like "artist:Madonna" would be automatically created from the ID3 tags when the file is created or when the ID3 tags are changed, but could still be changed manually...
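
As a rough sketch of this "automatic tags plus user tags" idea: the function below derives machine tags from the filename and from already-extracted embedded fields, then merges them with user tags. The names `automatic_tags` and `all_tags` are hypothetical, and the embedded fields are passed in as a plain dict rather than read from a real ID3 header.

```python
import os


def automatic_tags(filename, embedded):
    """Derive machine tags from the extension and embedded metadata fields."""
    tags = set()
    extension = os.path.splitext(filename)[1].lstrip(".").lower()
    if extension:
        tags.add("filetype:" + extension)
    for field, value in embedded.items():
        tags.add(f"{field}:{value}")   # e.g. "artist:Madonna"
    return tags


def all_tags(filename, embedded, user_tags):
    """Union of automatically derived tags and user-assigned tags."""
    return automatic_tags(filename, embedded) | set(user_tags)


print(sorted(all_tags("vogue.mp3", {"artist": "Madonna"}, ["music"])))
```

Keeping the derived tags in a separate function makes it easy to regenerate them whenever the embedded fields change, while leaving the user's own tags untouched.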

"A file system build around a metadata db (such as dbfs) could certainly be the future of filesystems, though if you've been following the kernel mailing list, it's pretty evident that linux is moving towards btrfs in the semi-near future anyway..."

People use many different filesystems, though. I think any good solution needs to be built on top of the file system, so that people can still index things on their FAT USB drives, etc. But it should be below things like Tracker, so that it is not app-specific and can be used by all apps in any desktop. Probably some kind of hidden directories or hidden files or something...

_alex_ wrote on the 21 Oct 08 at 00:56
>Audiobooks would be in mp3 files inside your home directory with ID3 tags and everything else, but should not show up in a music player. That's why we need tags like "music" and "audiobook", and not just machine tags like "filetype:audio".

Fair enough, to make such a fine distinction you may need user tags, though audio books might be identifiable as such from the ID3 tags (and hence would also be in the metadata db).

>What is it missing?
>The way I'm imagining it, what you call "metadata" would be accessible as "tags", in the same way as user-generated labels/tags. For instance, searching for "filetype:mp3" would show you all mp3s, while searching for "music" would show you all music files, but the "filetype:mp3" tag would be permanently, automatically attached to the file by the subsystem.
>Higher level tags like "artist:Madonna" would be automatically created from the ID3 tags when the file is created or when the ID3 tags are changed, but could still be changed manually...

What's missing is an actual database. What you suggest is overloading tags to wedge all kinds of different data into them. This is *bad*! Why? Let's see: filetypes, artist... what else? Oh: date received, oh wait no that's a date... probably shouldn't be stored as a string... How about ratings? No, can't do math operations on strings efficiently. Example "rating:>3". So now you may say let's have different tag types! Well good thinking! That's what a relational database is, and more :)

Bottom line: If you cram all of this information into tags, your search will no longer be fast because of wasted space (do you really need to store "filetype:" prefix for every filetype tuple? Answer: probably not :P) and collisions/unbalanced tree. So, as I said: What's missing is an actual database. Sure, you can tack one on top of tags, but it's not any different from what Tracker does (or tries to do) with a conventional filesystem.
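
To illustrate the typed-metadata argument, here is a minimal sketch using SQLite through Python's standard `sqlite3` module. The schema, table names, and sample rows are assumptions made up for this example and are not Tracker's actual schema. The rating is stored as an integer so a comparison like "rating:>3" is an ordinary numeric query, while free-form user tags live in their own table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE files (
        id       INTEGER PRIMARY KEY,
        path     TEXT UNIQUE NOT NULL,
        filetype TEXT,
        rating   INTEGER,              -- numeric, so > and < work directly
        received TEXT                  -- ISO 8601 date string
    );
    CREATE TABLE tags (
        file_id INTEGER NOT NULL REFERENCES files(id),
        tag     TEXT NOT NULL          -- free-form user tag
    );
    CREATE INDEX idx_files_rating ON files(rating);
""")
conn.executemany(
    "INSERT INTO files (id, path, filetype, rating, received) VALUES (?, ?, ?, ?, ?)",
    [
        (1, "/home/alex/anthem.mp3", "mp3", 5, "2008-10-17"),
        (2, "/home/alex/ballad.ogg", "ogg", 4, "2008-10-18"),
        (3, "/home/alex/novel.mp3", "mp3", 2, "2008-10-19"),
    ],
)
conn.executemany(
    "INSERT INTO tags (file_id, tag) VALUES (?, ?)",
    [(1, "music"), (2, "music"), (3, "audiobook")],
)


def rated_above(conn, tag, minimum):
    """Paths of files carrying the given user tag with rating > minimum."""
    rows = conn.execute(
        """
        SELECT f.path FROM files AS f
        JOIN tags AS t ON t.file_id = f.id
        WHERE t.tag = ? AND f.rating > ?
        ORDER BY f.path
        """,
        (tag, minimum),
    )
    return [path for (path,) in rows]


print(rated_above(conn, "music", 3))
```

Tags remain available as one table among several, which matches the point that tags are a subset of the metadata rather than a replacement for it.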

>People use many different filesystems, though. I think any good solution needs to be built on top of the file system, so that people can still index things on their FAT USB drives, etc. But it should be below things like Tracker, so that it is not app-specific and can be used by all apps in any desktop. Probably some kind of hidden directories or hidden files or something...

Oh I agree with that 100%. It's you that suggested it as a possible solution here: http://brainstorm.ubuntu.com/idea/9560/

It should in fact just be a metadata database on top of any filesystem (with the filesystem providing the necessary hooks to track file management). At least that's what I think, and have described here. However, some *cough*WinFS*cough* people may disagree.

Tracker does try to be a metadata repository and search API. Though it does the job very poorly in its current state. As I said before, I wouldn't be opposed to using another set of tools, if they are better. However, so far I've not seen anybody point me to a working solution with the exception of Nepomuk. Which I dare say I may start contributing to, because it is *exactly* what I envision the future to look like :)

_alex_ wrote on the 21 Oct 08 at 01:07
"Oh I agree with that 100%. It's you that suggested it as a possible solution here: http://brainstorm.ubuntu.com/idea/9560/"

I was referring to dbfs (not Tracker) in that sentence. I realized it wasn't clear as I only meant to quote the first two sentences of Endolith :)

_alex_ wrote on the 21 Oct 08 at 01:12
Also remember that tags would still be available with a central metadata database. That is, tags are a subset of the metadata that the DB will store, and allow searching on, but tags alone do not substitute well for the lack of a real *relational* database.

Endolith wrote on the 21 Oct 08 at 14:42
"Fair enough, to make such a fine distinction you may need user tags, though audio books might be identifiable as such from the ID3 tags (and hence would also be in the metadata db)."

Similarly, you'd want to differentiate photos from diagrams even though both are stored in jpeg, podcasts from music even though both are stored in mp3, instructional videos from movies, even though both are stored in mpg. On the other hand, you might want both music videos and music audio files to show up in your media player as the default "music" playlist.

I really don't like the Microsoft-esque categorization system of dividing things up by filetype. "My Music", "My Pictures", "My Videos", "My E-books", and so on. It should be organized with meaningful tags like "Music", "Podcasts", "Photos", "Diagrams", "Movies", "Music videos", "Textbooks", "Family vacation 2008", etc.

"Oh: date received, oh wait no that's a date... probably shouldn't be stored as a string..."

Why not? ISO format is a standardized, computer readable way to represent dates as strings, and could be parsed by tools to be represented in whatever format you want.
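
A small sketch of what storing dates as ISO strings buys you, assuming every value uses the same zero-padded YYYY-MM-DD form (the sample dates are made up). Sorting and range filters work on the raw strings, while arithmetic such as "how many days apart" still requires parsing into a date type.

```python
from datetime import date

received = ["2008-10-21", "2007-12-31", "2008-02-03"]

# Plain string ordering agrees with chronological ordering for this format.
by_string = sorted(received)
by_date = sorted(received, key=date.fromisoformat)
print(by_string == by_date)

# A range filter can also be done on the strings directly.
since_2008 = [d for d in received if d >= "2008-01-01"]
print(since_2008)

# Arithmetic needs real date objects.
span = date.fromisoformat(by_string[-1]) - date.fromisoformat(by_string[0])
print(span.days > 0)
```

The equivalence breaks as soon as formats are mixed (for example "2008-2-3"), which is one reason a typed column can still be the safer choice.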

'do you really need to store "filetype:" prefix for every filetype tuple? Answer: probably not :P'

No, but you should be able to search for files as if they were stored that way. Like ID3 tags, photos also have embedded EXIF data that should be searchable, too, for instance. So some of the metadata would be stored inside the file itself, and some of it would be in an external database?

"So, as I said: What's missing is an actual database. Sure, you can tack one on top of tags, but it's not any different from what Tracker does (or tries to do) with a conventional filesystem."

I didn't say anything about not using a database as the backend. I think we are talking about the same thing with different words. I have seen a few proposals for using file links and directories and such to store the information (tagsistant), but I don't think that would be sufficient. But it needs to be something stored along with the files, not in a separate database that is unaffected when the files' "real" paths are moved, unmounted, etc.

"Tracker does try to be a metadata repository and search API. Though it does the job very poorly in its current state. As I said before, I wouldn't be opposed to using another set of tools, if they are better."

I think the database should be a dedicated tool and Tracker should use the information from it instead of implementing its own system.

"However, so far I've not seen anybody point me to a working solution with the exception of Nepomuk. Which I dare say I may start contributing to, because it is *exactly* what I envision the future to look like :)"

I will look into it.

"So now you may say let's have different tag types! Well good thinking! That's what a relational database is, and more :)"

"Also remember that tags would still be available with a central metadata database. That is, tags are a subset of the metadata that the DB will store, and allow searching on, but tags alone do not substitute well for the lack of a real *relational* database."

Yes. See http://weblog.scifihifi.com/2005/08/05/meta-tags-the-poor-mans-rdf/

_alex_ wrote on the 21 Oct 08 at 16:26
Now we're just talking semantics. We certainly agree that tags are quite a useful tool for organizing files. Our differences come because I just stop there, but you propose overloading tags as in the poor man's rdf article.

I just think that's a bad solution because it is limiting (e.g. math operations on tag ratings).

Also you have to keep in mind that you need to organize the metadata into an efficient data structure for fast searches. So you can't just keep some of it in the files. You need to index it, if you want to search on it quickly.

So at the end of the day you have no choice but to detach (or rather copy) the metadata from the files in order to perform fast searches.
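
The "copy the metadata out and index it" step can be pictured as an inverted index: a map from each (field, value) pair to the set of files that carry it. The sketch below is only an in-memory illustration; `build_index`, `lookup`, and the sample records are hypothetical, and a real indexer would persist this structure on disk.

```python
from collections import defaultdict


def build_index(records):
    """Map each (field, value) pair to the set of paths that carry it."""
    index = defaultdict(set)
    for path, metadata in records.items():
        for field, value in metadata.items():
            index[(field, value)].add(path)
    return index


def lookup(index, **criteria):
    """Paths matching all criteria, found by intersecting posting sets."""
    postings = [index.get(item, set()) for item in criteria.items()]
    return set.intersection(*postings) if postings else set()


# Metadata as it might look after being copied out of the files.
RECORDS = {
    "/home/alex/vogue.mp3": {"artist": "Madonna", "rating": 5},
    "/home/alex/frozen.mp3": {"artist": "Madonna", "rating": 3},
    "/home/alex/anthem.ogg": {"artist": "Queen", "rating": 5},
}

index = build_index(RECORDS)
print(lookup(index, artist="Madonna", rating=5))
```

A search then touches only the posting sets for the requested values instead of opening and parsing every file.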

The implementations (e.g. Tracker and Beagle) so far have been poor, because they do a poor job of tracking file management. Even though the filesystem does provide all the necessary hooks for tracking (e.g., file created, file renamed, file moved events, etc).

Also remember, that I'm not only talking about files in my idea, but also Email, Pidgin conversations etc.

We both want more power to organize and search our data, I think we just differ on the implementation. So this is my last comment on my perspective :)

Endolith wrote on the 21 Oct 08 at 18:56
"Our differences come because I just stop there, but you propose overloading tags as in the poor man's rdf article."

I'm not sure what the difference is between "overloading tags" and accessing metadata with searches.

"So you can't just keep some of it in the files. You need to index it, if you want to search on it quickly."

That's a good point. So the database should copy the metadata (ID3, EXIF) out of the files and index it for fast searches. But then it has to keep track of whether that data changes and keep it updated? And those values in the database should not be editable by the user, but user-created tags should be? To change the artist of an mp3 file, you should have to edit the file's ID3 tag itself and then the database will be updated automatically?

"So at the end of the day you have no choice but to detach (or rather copy) the metadata from the files in order to perform fast searches."

True. I'd really like the metadata to be carried around with the files, though, and not wholly separate.

I want to be able to carry files from home to work on a USB stick, and have them in "the same place" (can be found with the same queries or tags) as they are at home. I want to be able to send files to other people with the metadata already attached. Maybe that would necessitate sending as .tar.gz files with included helper files or something, in the same way that you would have to zip something if you wanted to preserve paths on another machine.

Maybe the act of mounting a filesystem would index all of its metadata into the local database?

"We both want more power to organize and search our data, I think we just differ on the implementation. So this is my last comment on my perspective :)"

I don't think we differ on implementation. I am trying to figure out what the implementation would be.

andruk (Idea reviewer) wrote on the 10 May 09 at 09:11
The filesystem can notify a program (like Beagle, Tracker, or Nepomuk) when the last-modified date on a file changes. Then the database can rescan the file.
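
The rescan-on-change idea can be sketched with nothing but modification times. Note that this example polls with `os.stat` as a simple stand-in; it is not a real notification mechanism (on Linux, inotify would deliver change events without polling). `changed_files` and the temporary file are hypothetical names used only for the demonstration.

```python
import os
import tempfile


def changed_files(paths, last_seen):
    """Return paths whose mtime differs from the recorded one, updating last_seen."""
    changed = []
    for path in paths:
        mtime = os.stat(path).st_mtime_ns
        if last_seen.get(path) != mtime:
            last_seen[path] = mtime    # remember the new timestamp
            changed.append(path)       # this file needs a rescan
    return changed


with tempfile.TemporaryDirectory() as directory:
    song = os.path.join(directory, "song.mp3")
    open(song, "w").close()

    seen = {}
    print(len(changed_files([song], seen)))   # first pass: file is new to the index
    print(len(changed_files([song], seen)))   # nothing changed since the last pass

    # Simulate an edit by setting a different modification time.
    os.utime(song, (1_000_000_000, 1_000_000_000))
    print(len(changed_files([song], seen)))   # timestamp differs, so rescan
```

Only files whose timestamp moved are handed back for re-extraction, so the database does not have to re-read the whole collection on every pass.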

