Handling files - using boost::filesystem

published at 01.09.2015 14:41 by Jens Weller

This is the 9th part of my series about writing applications in C++ using Qt and boost. The last part was about integrating an HTML editor into Qt. As I am writing my own content management system, I have spent the last weeks thinking about how to store data. Since part of my data consists of actual files (images, CSS, JS, ...), I realized that the best way to handle those is to simply store them in and load them from the file system. For the UI, Qt even offers good support for displaying a folder in an application with QFileSystemModel. But inside the application I mostly need these files represented as collections, which are loaded at startup and edited through the UI.

Filesystem access in C++

Until now, I have mostly used Qt or wxWidgets to access folders and files in C++, as this was always tightly coupled with the UI. Both frameworks support selecting files and folders out of the box. Getting the content of a directory is really easy in wxWidgets with wxDir::GetAllFiles, a static method. Qt has its QDir class and enables file system access with QDirIterator. But neither is an option in my back end, as I want to rely mostly on standard C++ and boost.

boost::filesystem is not only a mature and widely used library for file system access, it is also the role model for std::filesystem: the File System TS should become part of C++17. This is the first time that I use boost::filesystem for real code. The code is very important, as it creates or loads a new project from the file system.

To avoid typing boost::filesystem::... everywhere, I use a namespace alias:

namespace fs = boost::filesystem;

What follows is the actual code used by my project. Let's start with creating a new project. This means that I have to create a few folders, which form the basic structure of every project in my planned CMS. In boost::filesystem the class fs::path represents a path on the file system, and in best C++ tradition it has an overloaded operator: /. This makes path-related code very readable, and you no longer have to add any (back)slashes to your paths by hand:

DocumentTreeItem::item_t createProject(const std::string &name,const std::string& basepath, DocumentTreeItem::item_t &parent)
{
    DocumentTreeItem::item_t doc = basicProject(name,basepath,parent);
    //create basic directories
    fs::path p = basepath +"/"+ name;
    fs::create_directories(p / "web" / "css");
    fs::create_directory(p / "web" / "img");
    fs::create_directory(p / "web" / "js");
    fs::create_directory(p / "template");
    auto document = doc->get<Document>();
    document->setBasepath(p.generic_string());
    p /= "web";
    document->setWebpath(p.generic_string());
    document->getImageList()->setPath(p.generic_string() + "/img/");
    fs::copy_file(p / "../../../tinymce3/examples/editor.html", p / "editor.html");
    return doc;
}

This code creates the basic directory structure. The basicProject function only creates the basic tree structure for the project and does nothing file system related; it is shared with the loadProject function. The fs::create_directories function creates all non-existing directories in the path, while fs::create_directory creates only the last directory in the path, so its parent directory has to exist already. The path class can be converted to std::string with two methods, string() and generic_string(): string() gives you the native format, generic_string() the portable format. Personally, I would prefer to get the portable format through string() and have a native_string() method for the native path...
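As a minimal sketch of the two creation functions, the following builds the same folder layout under an arbitrary root. It is an illustration, not the project's code: make_project_dirs is a hypothetical helper, and the alias points at std::filesystem (C++17), the standardized successor mentioned earlier, so that the snippet compiles without Boost. The calls used here exist under the same names in boost::filesystem.

```cpp
#include <filesystem>
#include <string>

// Assumption: std::filesystem stands in for boost::filesystem in this sketch.
namespace fs = std::filesystem;

// Hypothetical helper: creates the folder layout described above under `root`
// and returns the web directory in the portable (forward slash) format.
std::string make_project_dirs(const fs::path& root)
{
    // create_directories also creates the missing parents: root and root/web.
    fs::create_directories(root / "web" / "css");

    // create_directory creates only the last element; its parent exists by now.
    fs::create_directory(root / "web" / "img");
    fs::create_directory(root / "web" / "js");
    fs::create_directory(root / "template");

    return (root / "web").generic_string();
}
```

Calling fs::create_directory(root / "web" / "img") first, before root / "web" exists, would fail, which is why the first call uses fs::create_directories.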

The createProject function has to set up some parts of the document class, and then a file is copied via fs::copy_file. This is the editor template, which needs to be copied into the web directory of each project. The reason is that I can't set the baseURI of the loaded editor in Qt correctly: this has never worked, and it seems to fall back to the file system. For the same reason, the image, js and css folders have to be under /web/, while in the UI they are displayed alongside web.

When you are able to create projects, you also want to be able to load them again:

DocumentTreeItem::item_t loadProject(const std::string &name,const std::string &basepath, DocumentTreeItem::item_t &parent)
{
    fs::path p = basepath + "/" + name;
    bool load_web = fs::exists(p / "data.dat");
    DocumentTreeItem::item_t doc = basicProject(name,basepath,parent,!load_web);
    auto document = doc->get<Document>();
    document->getCsslist()->setFiles(load_dir_recursive(p / "web" / "css"));
    document->getJslist()->setFiles(load_dir_recursive(p / "web" / "js"));
    document->getImageList()->setFiles(load_dir_recursive(p / "web" / "img"));
    document->setBasepath(p.generic_string());
    return doc;
}

The actual code is already a bit more advanced, and stores the non file system data in a data.dat file in the project root folder. But as the data model changes, this file often needs to be deleted, as I don't yet want to handle versioning just to add more data to the serialization part of my code. So loadProject needs to check whether this file exists, which again is easy with fs::exists. The basicProject function also needs to know whether it should create the default structure of the project, including the tree nodes that are usually loaded later via serialization when opening a saved project. The important part of this function is loading the css, js and image files from the file system, which is done via load_dir_recursive:

boost::container::flat_set<std::string> load_dir_recursive(const fs::path& path)
{
    boost::container::flat_set<std::string> set;
    //length of the base path plus the separator that follows it
    std::string::size_type pathsize = path.generic_string().size()+1;
    for(fs::directory_entry& entry: fs::recursive_directory_iterator(path))
        set.insert(entry.path().generic_string().substr(pathsize));
    return set;
}

All file lists are currently represented as boost::container::flat_set. Iterating over the files is really easy: fs::recursive_directory_iterator lets you iterate over a path, and a fs::directory_entry represents the actual entry. I do not need the full path, only the local path, and the substr with the path size works very well for that. I am not sure if filesystem has support for a better solution; relative_path() does not return the format I need. I could also check the type of the path with fs::is_regular_file or fs::is_directory, to have only files or only directories in the list. Currently the list is filtered in the UI.

When you only want to load the content of a directory without its sub folders, you can simply use fs::directory_iterator.
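A short sketch of the non-recursive case, under the same assumptions as before: list_dir is a hypothetical helper and std::filesystem stands in for boost::filesystem, where fs::directory_iterator is used the same way.

```cpp
#include <algorithm>
#include <filesystem>
#include <string>
#include <vector>

// Assumption: std::filesystem stands in for boost::filesystem in this sketch.
namespace fs = std::filesystem;

// Hypothetical helper: names of the direct children of `dir`, sub folders not entered.
std::vector<std::string> list_dir(const fs::path& dir)
{
    std::vector<std::string> names;
    for (const fs::directory_entry& entry : fs::directory_iterator(dir))
        names.push_back(entry.path().filename().string());

    // The iteration order is unspecified, so sort for a stable result.
    std::sort(names.begin(), names.end());
    return names;
}
```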

One thing my file list classes need is to actually delete files when they are deleted in the UI; this is done via fs::remove. Renaming files is currently not supported, but it would be easy to add via fs::rename.

More on boost::filesystem

This is how I currently use boost::filesystem. But filesystem supports more: it can give you the fs::current_path, and it can resolve a local path to an absolute one via fs::system_complete. You can also query some properties, like status, last write time or file_size.

Boost filesystem also brings its own file streams, which can be used with fs::path to create, read or write files. All of the code above will throw if an error occurs, but the filesystem API always offers a non-throwing alternative that takes an additional boost::system::error_code& parameter. I prefer a try-catch block around calls into this code, which is much cleaner. Further examples and resources can be found in the filesystem chapter at theboostcpplibraries.com.
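The two error handling styles side by side, as a minimal sketch. Both function names are hypothetical, and since the snippet uses std::filesystem to compile without Boost, the error type is std::error_code where the Boost version takes boost::system::error_code.

```cpp
#include <cstdint>
#include <filesystem>
#include <optional>
#include <system_error>

// Assumption: std::filesystem stands in for boost::filesystem in this sketch.
namespace fs = std::filesystem;

// Non-throwing style: the extra error_code parameter reports the failure.
std::optional<std::uintmax_t> size_or_nothing(const fs::path& p)
{
    std::error_code ec;
    const std::uintmax_t size = fs::file_size(p, ec);
    if (ec)
        return std::nullopt; // e.g. the file does not exist
    return size;
}

// Throwing style: the same call without error_code throws fs::filesystem_error.
bool size_query_throws(const fs::path& p)
{
    try
    {
        fs::file_size(p);
        return false;
    }
    catch (const fs::filesystem_error&)
    {
        return true;
    }
}
```

The error_code form fits places where a missing file is an expected case, while the try-catch form keeps the normal path free of checks.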

 
