Merged
5 changes: 3 additions & 2 deletions CMakeLists.txt
@@ -78,15 +78,16 @@ include_directories(
# Library target
#
set(VS_XML_SOURCES
lib/private/impl.cpp
lib/private/wrp-impl.cpp
lib/archive.cpp
lib/parser.cpp
lib/serializer.cpp
lib/tree.cpp
lib/document.cpp
lib/tree-builder.cpp
lib/query.cpp
lib/query-builder.cpp
lib/node.cpp
lib/wrp-node.cpp
)

add_library(vs-xml
17 changes: 13 additions & 4 deletions RELEASE.md
@@ -1,9 +1,18 @@
New release, fresh out of the oven. Many of the improvements are not reported in here, as they are mostly infrastructural and will not impact the end user; please check commits if you want more details.
The main focus was to extend the current interface with useful utilities, remove even more dynamic allocations when not needed, and fix some downstream issues when linking this library.

## Breaking

- `DocBuilder` renamed as `DocumentBuilder` to be more consistent in naming.
- Removed `path` functions from XML entities, leftovers from the very early versions;
this functionality can now be trivially replaced by user-defined functions, since the rest of the interface is complete.
- The binary interface for trees and derived friends changed. Again. But it is for good reasons! We optimized away one of the biggest fields in nodes, saving a significant amount of memory.
Technically this change prevents out of order nodes in the memory layout, but this was just a side-effect extra, not something intended.

## New features

## Features
Introduced `TreeRaw::visit` and `Tree::visit` to implement a more flexible visitor pattern compared to the recently added iterator-based approach.
They are both based on `private/(wrp-)visit.hpp`, which is not publicly exposed (for now).

Introduced `Tree::visit` and `TreeRaw::visit` to implement a slightly different visitor pattern compared to the recently added iterators.
Introduced a new `print2` function for trees and derived siblings, to provide a non-recursive variant of `print` whose stack usage does not grow with the depth of the tree.
Not tested yet, but it will deprecate `print`.
New `print` functions have been introduced for trees, based on the visitor pattern. They no longer use `std::print` due to the awful overhead and additional memory allocations. `fmt` had no such issue to be honest.
The legacy version has been optimized as well: it is now called `print_fast` and still uses simple recursion to get a significant edge on performance; however, be mindful of stack overflows if working with stack-intensive applications or deeply nested trees.
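
The trade-off between a recursive printer and one that keeps its own traversal state can be sketched as follows. This is only an illustration under stated assumptions: `Node`, `print_recursive` and `print_iterative` are hypothetical names, and the layout is not the one used by `vs.xml`. The recursive version consumes one call frame per nesting level, while the iterative version moves that state into a heap-backed `std::vector`.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical node type for illustration only; not the vs.xml node layout.
struct Node {
    std::string name;
    std::vector<Node> children;
};

// Recursive serializer: simple and fast, but call-stack depth equals tree depth.
void print_recursive(const Node& n, std::string& out) {
    out += "<" + n.name + ">";
    for (const Node& c : n.children) print_recursive(c, out);
    out += "</" + n.name + ">";
}

// Iterative serializer: the traversal state lives in an explicit stack,
// so the call stack stays flat regardless of how deep the tree is.
void print_iterative(const Node& root, std::string& out) {
    struct Frame {
        const Node* node;
        std::size_t next_child;  // index of the next child to visit
    };
    std::vector<Frame> stack;
    out += "<" + root.name + ">";
    stack.push_back({&root, 0});
    while (!stack.empty()) {
        Frame& top = stack.back();
        if (top.next_child < top.node->children.size()) {
            const Node& child = top.node->children[top.next_child++];
            out += "<" + child.name + ">";
            stack.push_back({&child, 0});  // `top` is not used after this point
        } else {
            out += "</" + top.node->name + ">";
            stack.pop_back();
        }
    }
}
```

Both functions produce the same text for the same tree; the iterative one trades a little bookkeeping for bounded call-stack usage.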
1 change: 1 addition & 0 deletions TODO.md
@@ -1,6 +1,7 @@
- [ ] Deprecate this file plz.
- [ ] Random access to attributes for the iterator.
- [ ] Tree builder method to use injection maps when generating the tree.
- [ ] External visitor interface to Tree/TreeRaw. This allows for more flexibility and templating.

## Query redesign

2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
0.2.12
0.2.13
11 changes: 5 additions & 6 deletions benchmark/src/serialize-big.cpp
@@ -16,8 +16,8 @@

int test_vs(std::string_view xmlInput){
try{
xml::DocumentBuilder<{.symbols=xml::builder_config_t::EXTERN_REL,.raw_strings=true}> bld(xmlInput);
xml::Parser parser(xmlInput, bld);
VS_XML_NS::DocumentBuilder<{.symbols=VS_XML_NS::builder_config_t::EXTERN_REL,.raw_strings=true}> bld(xmlInput);
VS_XML_NS::Parser parser(xmlInput, bld);
std::ignore = parser.parse();

auto tree = bld.close();
@@ -29,7 +29,7 @@ int test_vs(std::string_view xmlInput){
std::string str;
std::stringstream file(str);

tree->print(file);
tree->print_fast(file);

}catch (const std::exception &ex) {
std::cerr << "Error while testing: " << ex.what() << "\n";
@@ -45,13 +45,13 @@ int test_vs2(std::string_view binInput){
std::span<const uint8_t> binInput((const uint8_t*)mmap.data(),mmap.size());


auto tree =xml::Document::from_binary(binInput);
auto tree =VS_XML_NS::Document::from_binary(binInput);

std::string str;
std::stringstream file(str);

//tree.print(file);
tree->save_binary(file);
tree->print_fast(file);

}catch (const std::exception &ex) {
std::cerr << "Error while testing: " << ex.what() << "\n";
@@ -92,7 +92,6 @@ int main(int argc, const char* argv[]) {

mio::mmap_source mmap3("./assets/nasa_10_f_bs.xml.bin");
std::string_view binInput(mmap3.data(),mmap3.size());

for(int i = 0; i<3; i++){
std::vector<decltype(std::chrono::system_clock::now())> ticks;
ticks.push_back(std::chrono::system_clock::now());
4 changes: 2 additions & 2 deletions docs/embedded.md
@@ -22,13 +22,13 @@ It is important to enable the `noexcept` flag and disable `utils` alongside any
- `TreeRaw`/`Tree` general usage
- `DocumentRaw`/`Document` general usage
- `ArchiveRaw`/`Archive` general usage
- The XML parser when `.raw_strings=true`, however wraps builders which are not fully optimized yet.
- The XML parser when `.raw_strings=true`, however it wraps builders which are not fully optimized yet.
- The XML serializer when `.raw_strings=true`.
- Memos/notes/indices can all be implemented externally; as long as you have a proper library for containers, `vs.xml` will not get in your way.

### 🟠 Features planned for embedded
- `TreeBuilder`, `DocumentBuilder`, `ArchiveBuilder` & `QueryBuilder`. Right now they encapsulate their own storage and cannot work on externally defined containers, so memory allocations cannot be handled externally.
It is possible to reserve space, thereby limiting the number of allocations, but they cannot be fully removed as things stand.
- The XML serializer when `.raw_strings=true`. It is still using functions which are not optimized, but their replacement has been implemented already. It also assumes it operates on a stream, which is not great.
- Queries. Right now they are not good due to the high number of dynamic allocations needed. They could be trivially removed for the most part, but the whole system is being refactored to be stack-based and consume less memory overall.
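
The remark about reserving space to limit allocations can be illustrated with a small standalone sketch. It does not use any `vs.xml` builder: `CountingAllocator` and `count_allocations` are hypothetical helpers, and a plain `std::vector` stands in for the builder's internal storage. The idea is simply to count how many times the container asks its allocator for memory, with and without an upfront `reserve`.

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical allocator that counts calls to allocate(); illustration only.
template <typename T>
struct CountingAllocator {
    using value_type = T;
    std::size_t* counter;  // incremented on every allocation request

    explicit CountingAllocator(std::size_t* c) noexcept : counter(c) {}
    template <typename U>
    CountingAllocator(const CountingAllocator<U>& other) noexcept
        : counter(other.counter) {}

    T* allocate(std::size_t n) {
        ++*counter;
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) noexcept { ::operator delete(p); }

    template <typename U>
    bool operator==(const CountingAllocator<U>& o) const noexcept {
        return counter == o.counter;
    }
    template <typename U>
    bool operator!=(const CountingAllocator<U>& o) const noexcept {
        return counter != o.counter;
    }
};

// Fills a vector with n_items integers and reports how many allocations
// were requested, optionally reserving the full capacity first.
std::size_t count_allocations(std::size_t n_items, bool reserve_first) {
    std::size_t allocations = 0;
    {
        CountingAllocator<int> alloc(&allocations);
        std::vector<int, CountingAllocator<int>> v(alloc);
        if (reserve_first) v.reserve(n_items);
        for (std::size_t i = 0; i < n_items; ++i) {
            v.push_back(static_cast<int>(i));
        }
    }
    return allocations;
}
```

Calling `count_allocations(1000, true)` should report fewer allocation requests than `count_allocations(1000, false)`, since growth by reallocation is avoided; the exact numbers depend on the standard library in use.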

### 🔴 Features not planned for embedded
15 changes: 15 additions & 0 deletions docs/releases/v0.2.13.md
@@ -0,0 +1,15 @@
## Breaking

- `DocBuilder` renamed as `DocumentBuilder` to be more consistent in naming.
- Removed `path` functions from XML entities, leftovers from the very early versions;
this functionality can now be trivially replaced by user-defined functions, since the rest of the interface is complete.
- The binary interface of the tree changed. Again. But it is for good reasons! We optimized away one of the biggest fields in nodes, saving a significant amount of memory.
Technically this change prevents out of order nodes in the memory layout, but this was just a side-effect extra, not something intended.

## Features

Introduced `TreeRaw::visit` and `Tree::visit` to implement a more flexible visitor pattern compared to the recently added iterator-based approach.
They are both based on `private/(wrp-)visit.hpp`, which is not publicly exposed (for now).

New `print` functions have been introduced for trees, based on the visitor pattern. They no longer use `std::print` due to the awful overhead and additional memory allocations. `fmt` had no such issue to be honest.
The legacy version has been optimized as well: it is now called `print_fast` and still uses simple recursion to get a significant edge on performance; however, be mindful of stack overflows if working with stack-intensive applications or deeply nested trees.
13 changes: 10 additions & 3 deletions include/vs-xml/commons.hpp
@@ -21,7 +21,6 @@


#include <span>
#include <string>
#include <string_view>

#include <bit>
@@ -175,10 +174,20 @@ struct marker_t;
struct root_t;
struct unknown_t;

struct node_iterator;
struct attr_iterator;
struct text_iterator;
struct visitor_iterator;

namespace wrp{
template <typename T>
struct base_t;
struct sv;

struct node_iterator;
struct attr_iterator;
struct text_iterator;
struct visitor_iterator;
}

namespace details{
@@ -234,8 +243,6 @@ concept thing_i = requires(T self){
{self.has_parent()} -> std::same_as<bool>;
{self.has_prev()} -> std::same_as<bool>;
{self.has_next()} -> std::same_as<bool>;

{self.path()} -> std::same_as<std::string>;
};

//TODO: specialization of Builder_t or just remove it?
57 changes: 21 additions & 36 deletions include/vs-xml/document.hpp
@@ -10,10 +10,12 @@
*
*/

#include <expected>
#include <vs-xml/commons.hpp>
#include <vs-xml/tree.hpp>
#include <vs-xml/tree-builder.hpp>
#include <expected>

#include <vs-xml/commons.hpp>
#include <vs-xml/tree.hpp>
#include <vs-xml/tree-builder.hpp>
#include <vs-xml/node.hpp>

namespace VS_XML_NS{

@@ -31,42 +33,25 @@ struct DocumentBuilder;
struct DocumentRaw : TreeRaw {
using TreeRaw::TreeRaw;

inline bool print(std::ostream& out, const print_cfg_t& cfg = {})const{
for(auto& it: TreeRaw::root().children()){
if(!print_h(out, cfg, &it))return false;
}
return true;
}
bool print(std::ostream& out, const print_cfg_t& cfg = {})const;
bool print_fast(std::ostream& out, const print_cfg_t& cfg = {})const;

/**
* @brief Return the root of the proper tree inside the document (if present)
*
* @return std::optional<node_iterator>
*/
inline std::optional<node_iterator> tree_root() const{
auto c = TreeRaw::root().children();
auto it = std::ranges::find_if(c,[](auto e)static{return e.type()==type_t::ELEMENT;});
if(it!=c.end()) return it;
return {};
}

[[nodiscard]] static inline std::expected<DocumentRaw,TreeRaw::from_binary_error_t> from_binary(std::span<uint8_t> region){
std::expected<TreeRaw, TreeRaw::from_binary_error_t> t = TreeRaw::from_binary(region);
if(!t.has_value())return std::unexpected(t.error());
else return DocumentRaw(std::move(*t));
}
[[nodiscard]] static inline const std::expected<const DocumentRaw,TreeRaw::from_binary_error_t> from_binary(std::span<const uint8_t> region){
std::expected<const TreeRaw, TreeRaw::from_binary_error_t> t = TreeRaw::from_binary(region);
if(!t.has_value())return std::unexpected(t.error());
else return DocumentRaw(std::move(*t));
}
[[nodiscard]] std::optional<node_iterator> tree_root() const;

[[nodiscard]] static std::expected<DocumentRaw,TreeRaw::from_binary_error_t> from_binary(std::span<uint8_t> region);
[[nodiscard]] static const std::expected<const DocumentRaw,TreeRaw::from_binary_error_t> from_binary(std::span<const uint8_t> region);

template<builder_config_t cfg>
friend struct DocumentBuilder;

//TODO: Replace with proper prototypes, and encapsulate the mv mechanism away as it is an implementation detail, not semantically correct.
DocumentRaw(TreeRaw&& src):TreeRaw(src){}
DocumentRaw(const TreeRaw&& src):TreeRaw(src){}
DocumentRaw(TreeRaw&& src);
DocumentRaw(const TreeRaw&& src);

};

@@ -82,19 +67,19 @@ struct Document : DocumentRaw {


public:
inline Document(DocumentRaw&& ref):DocumentRaw(std::move(ref)){}
inline Document(const DocumentRaw&& ref):DocumentRaw(std::move(ref)){}
Document(DocumentRaw&& ref);
Document(const DocumentRaw&& ref);

inline const Tree slice(const element_t* ref=nullptr) const{return DocumentRaw::slice(ref);}
inline Tree clone(const element_t* ref=nullptr, bool reduce=true) const{return DocumentRaw::clone(ref,reduce);}
const Tree slice(const element_t* ref=nullptr) const;
Tree clone(const element_t* ref=nullptr, bool reduce=true) const;

inline wrp::base_t<unknown_t> root() {return wrp::base_t<unknown_t>{*(const TreeRaw*)this, &TreeRaw::root()};}
wrp::base_t<unknown_t> root();

///Cast this document as a raw document
inline DocumentRaw& downgrade(){return *this;}
DocumentRaw& downgrade();

///Cast this const document as a const raw tree
inline const DocumentRaw& downgrade() const{return *this;}
const DocumentRaw& downgrade() const;
};


1 change: 1 addition & 0 deletions include/vs-xml/fwd/README.md
@@ -0,0 +1 @@
Forwarding headers that select which libraries `vs.xml` uses internally. Do not use them externally unless the respective libraries are also linked.
37 changes: 20 additions & 17 deletions include/vs-xml/fwd/format.hpp
@@ -10,21 +10,24 @@
*
*/

#if VS_XML_USE_FMT == true && __has_include(<fmt/core.h>)

#if VS_XML_USE_FMT == true && __has_include(<fmt/core.h>)

#include <fmt/core.h>

namespace xml{
using fmt::format;
}

#else

#include <format>

namespace xml{
using std::format;
}

#endif
#include <fmt/core.h>

namespace xml{
using fmt::format;
}

#else

#if VS_XML_USE_FMT == true && !__has_include(<fmt/core.h>)
#warning "Unable to use fmt, header missing"
#endif

#include <format>

namespace xml{
using std::format;
}

#endif
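
The header-selection pattern in this file can be reproduced in isolation as a rough sketch. Everything named here is an assumption for illustration: `DEMO_USE_FMT` stands in for `VS_XML_USE_FMT`, and `demo::version_string` is a hypothetical helper. To keep the sketch buildable on older toolchains, the fallback branch uses `std::to_string` instead of `std::format`, which is a deliberate substitution and not what the header above does.

```cpp
#include <string>

// DEMO_USE_FMT is a hypothetical opt-in macro standing in for VS_XML_USE_FMT.
// The fmt branch is taken only when the user opts in AND the header exists.
#if defined(DEMO_USE_FMT) && __has_include(<fmt/core.h>)

#include <fmt/core.h>

namespace demo {
inline constexpr bool uses_fmt = true;
inline std::string version_string(int major, int minor, int rev) {
    return fmt::format("{}.{}.{}", major, minor, rev);
}
}  // namespace demo

#else

// Surface a diagnostic when fmt was requested but could not be found,
// instead of silently switching implementation.
#if defined(DEMO_USE_FMT)
#warning "fmt requested but <fmt/core.h> is missing; using the fallback"
#endif

namespace demo {
inline constexpr bool uses_fmt = false;
// Fallback built on std::to_string (substituted for std::format in this sketch).
inline std::string version_string(int major, int minor, int rev) {
    return std::to_string(major) + "." + std::to_string(minor) + "." +
           std::to_string(rev);
}
}  // namespace demo

#endif
```

Callers use `demo::version_string(0, 2, 13)` the same way in both configurations; only the backing implementation differs, and the fmt branch additionally requires linking against fmt.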
38 changes: 21 additions & 17 deletions include/vs-xml/fwd/print.hpp
@@ -11,20 +11,24 @@
*/


#if VS_XML_USE_FMT == true && __has_include(<fmt/ostream.h>)

#include <fmt/ostream.h>

namespace xml{
using fmt::print;
}

#else

#include <print>

namespace xml{
using std::print;
}

#endif
#if VS_XML_USE_FMT == true && __has_include(<fmt/ostream.h>)

#include <fmt/ostream.h>

namespace xml{
using fmt::print;
}

#else

#if VS_XML_USE_FMT == true && !__has_include(<fmt/ostream.h>)
#warning "Unable to use fmt, header missing"
#endif

#include <print>

namespace xml{
using std::print;
}

#endif
14 changes: 7 additions & 7 deletions include/vs-xml/meson.build
@@ -1,3 +1,8 @@
version_components = meson.project_version().split('.')
major_version = version_components[0]
minor_version = version_components[1]
rev_version = version_components[2]

conf = configuration_data()
conf.set('VS_XML_VERSION_MAJOR', major_version)
conf.set('VS_XML_VERSION_MINOR', minor_version)
@@ -19,22 +24,17 @@ conf.set('VS_XML_NS', get_option('ns'))
conf.set('VS_XML_LAYOUT', get_option('binlayout'))

if get_option('use_fmt')
fmt_dep = dependency('fmt')
conf.set('VS_XML_USE_FMT','true')
else
fmt_dep = []
endif

if get_option('use_gtl')
gtl_dep = dependency('gtl')
conf.set('VS_XML_USE_GTL','true')
else
gtl_dep = []
endif

cfgfile = configure_file(output : 'config.hpp',
configuration : conf,
)

install_headers(cfgfile, subdir:'vs-xml')
install_subdir('.', install_dir : 'include/vs-xml', strip_directory: false, follow_symlinks: true, exclude_files: ['meson.build','config.hpp.in'] )
#TODO: Exclude private subdir
install_subdir('.', install_dir : get_option('includedir')+'/vs-xml', strip_directory: false, follow_symlinks: true, exclude_files: ['meson.build','config.hpp.in'] )
2 changes: 2 additions & 0 deletions include/vs-xml/module.modulemap
@@ -1,5 +1,7 @@
module xml {
header "commons.hpp"
header "node.hpp"
header "wrp-node.hpp"
header "tree-builder.hpp"
header "document-builder.hpp"
header "archive-builder.hpp"