Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use std::string_view as argument instead of std::string on C++17 Compilers #167

Merged
merged 21 commits into from
Jun 19, 2018

Conversation

jimmyjxiao
Copy link
Contributor

The major benefit is if someone tries to bind a char* parameter, we can avoid an extra copy that would happen if a new std::string was created from it. This pull request also includes some improvements for building/running the tests on MSVC, mostly because I suck at git and didn't know how to separate them into a separate pull request.

Copy link
Collaborator

@aminroosta aminroosta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me 👍
@zowpowow please update the Readme

@aminroosta aminroosta requested a review from zauguin June 17, 2018 04:48
@zauguin
Copy link
Collaborator

zauguin commented Jun 17, 2018

Thanks for the pull request!

While this works great for users passing literals and std::strings, some changes are needed to also support not null-terminated string_views.
Details inline.

if (hresult != SQLITE_OK) errors::throw_sqlite_error(hresult, utility::utf16_to_utf8(sql.data()));
if (!std::all_of(static_cast<const char16_t*>(remaining), sql.data() + sql.size(), [](char16_t ch) {return std::isspace(ch); }))
throw errors::more_statements("Multiple semicolon separated statements are unsupported", utility::utf16_to_utf8(sql.data()));
return tmp;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an advantage in not reusing the UTF-8 version?
I see some smaller problems here:

  • The SQL statement in the exception is no longer the same statement given to SQLite.
    In edge cases utf16_to_utf8 might behave differently than SQLite which can make debugging hard.
  • Less usage of utf16_to_utf8 in main code paths might lead to problems in this function to go unnoticed which again makes debugging a nightmare if they only surface at error conditions.
  • It is marginally slower because in case of an error the conversion happens twice.
  • And of course we introduce some redundant code.

Copy link
Collaborator

@zauguin zauguin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid similar problems it would be nice to add some tests.

}

sqlite3_stmt* _prepare(const std::string& sql) {
sqlite3_stmt* _prepare(const STR_REF& sql) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should accept a string_view by value instead of const reference. This applies to all functions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can I ask what the advantage of this would be? I know that in my use case, I'm going to have a few string_view objects already around and wouldn't passing by const reference be marginally faster than passing by value?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In most cases they will be inlined, then it doesn't make a difference. Otherwise the advantage is that a const reference requires some object to reference, so the string_view has to exists in memory(typically on the stack). But when passing by value small objects like this normally are passed in registers which does not require any usage of (slow) memory.
Most of the time the string_view will not exist in memory before(because it is a temporary which did not exists or because the optimizer keeps the values in registers), so copying 2*sizeof(void*) between registers should be much faster than writing this into memory, copying the pointer and reading from memory again.
This does mostly apply for the Itanium ABI, so not on Windows. But output from Compiler-Explorer suggests that passing by value saves a memory access on Windows too.
Additionally the interface looks cleaner 😉

Disclaimer: I did not run a benchmark on this so I could be wrong. In this case, I would love to hear why.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're probably right. I wrote some super crappy benchmarks and it seems passing by string_view was slightly faster: https://github.com/zowpowow/sqlite_modern_cpp/tree/str_view_by_value

}

sqlite3_stmt* _prepare(const std::string& sql) {
sqlite3_stmt* _prepare(const STR_REF& sql) {
int hresult;
sqlite3_stmt* tmp = nullptr;
const char *remaining;

This comment was marked as resolved.

@@ -362,7 +370,7 @@ namespace sqlite {
std::shared_ptr<sqlite3> _db;

public:
database(const std::string &db_name, const sqlite_config &config = {}): _db(nullptr) {
database(const STR_REF &db_name, const sqlite_config &config = {}): _db(nullptr) {
sqlite3* tmp = nullptr;
auto ret = sqlite3_open_v2(db_name.data(), &tmp, static_cast<int>(config.flags), config.zVfs);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assumes that db_name is null-terminated. The db_name ends up as a filenamem, so not null-terminated strings are not supported here.
Maybe we should keep std::string here and optionally add a const char * overload?

database(const std::u16string &db_name, const sqlite_config &config = {}): _db(nullptr) {
auto db_name_utf8 = utility::utf16_to_utf8(db_name);
database(const U16STR_REF &db_name, const sqlite_config &config = {}): _db(nullptr) {
auto db_name_utf8 = utility::utf16_to_utf8(db_name.data());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same problem as above.

typedef utility::function_traits<Function> traits;

auto funcPtr = new auto(std::forward<Function>(func));
if(int result = sqlite3_create_function_v2(
_db.get(), name.c_str(), traits::arity, SQLITE_UTF8, funcPtr,
_db.get(), name.data(), traits::arity, SQLITE_UTF8, funcPtr,

This comment was marked as resolved.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should stick to .c_str() here because it makes clearer that we really need a c-style string, especially the trailing nul byte.

template<>
struct has_sqlite_type<std::string, SQLITE3_TEXT, void> : std::true_type {};

inline int bind_col_in_db(sqlite3_stmt* stmt, int inx, const std::string& val) {
inline int bind_col_in_db(sqlite3_stmt* stmt, int inx, const STR_REF& val) {
return sqlite3_bind_text(stmt, inx, val.data(), -1, SQLITE_TRANSIENT);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-1 has to be replaced with the actual length.

@@ -170,30 +182,29 @@ namespace sqlite {
std::string(reinterpret_cast<char const *>(sqlite3_value_text(value)), sqlite3_value_bytes(value));
}

inline void store_result_in_db(sqlite3_context* db, const std::string& val) {
inline void store_result_in_db(sqlite3_context* db, const STR_REF& val) {
sqlite3_result_text(db, val.data(), -1, SQLITE_TRANSIENT);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-1 has to be replaced with the actual length.

template<>
struct has_sqlite_type<std::u16string, SQLITE3_TEXT, void> : std::true_type {};

inline int bind_col_in_db(sqlite3_stmt* stmt, int inx, const std::u16string& val) {
inline int bind_col_in_db(sqlite3_stmt* stmt, int inx, const U16STR_REF& val) {
return sqlite3_bind_text16(stmt, inx, val.data(), -1, SQLITE_TRANSIENT);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-1 has to be replaced with the actual length.

return sqlite3_value_type(value) == SQLITE_NULL ? std::u16string() :
std::u16string(reinterpret_cast<char16_t const *>(sqlite3_value_text16(value)), sqlite3_value_bytes16(value));
}

inline void store_result_in_db(sqlite3_context* db, const std::u16string& val) {
inline void store_result_in_db(sqlite3_context* db, const U16STR_REF& val) {
sqlite3_result_text16(db, val.data(), -1, SQLITE_TRANSIENT);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-1 has to be replaced with the actual length.

@@ -3,7 +3,7 @@ OPTION(ENABLE_SQLCIPHER_TESTS "enable sqlchipher test")

# Creates the file compile_commands.json in the build directory.
SET(CMAKE_EXPORT_COMPILE_COMMANDS ON)
set (CMAKE_CXX_STANDARD 14)
set (CMAKE_CXX_STANDARD 17)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still support C++14, so it would make sense to keep this set to 14. Especially since our Travis compiler is so old that it does not support most C++17 features anyway.

@zauguin
Copy link
Collaborator

zauguin commented Jun 18, 2018

Nice. Updating the ReadMe and adding a test would be great. There are some non-functional changes left but these are details.

It is quite sad that this doesn't really work out for function and database names. These should mostly be covered by small string optimization anyway but we could overload them for C-strings to avoid temporary strings at least when a literal is given? After merging this we might want a separate Pull Request in that direction.

Copy link
Collaborator

@zauguin zauguin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some smaller possible changes

if(ret != SQLITE_OK) errors::throw_sqlite_error(_db ? sqlite3_extended_errcode(_db.get()) : ret);
sqlite3_extended_result_codes(_db.get(), true);
if(config.encoding != Encoding::UTF8)
database(const std::u16string &db_name, const sqlite_config &config = {}): database(utility::utf16_to_utf8(db_name.data()), config) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.data() is useless here and causes a unnecessary strlen

typedef utility::function_traits<Function> traits;

auto funcPtr = new auto(std::forward<Function>(func));
if(int result = sqlite3_create_function_v2(
_db.get(), name.c_str(), traits::arity, SQLITE_UTF8, funcPtr,
_db.get(), name.data(), traits::arity, SQLITE_UTF8, funcPtr,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should stick to .c_str() here because it makes clearer that we really need a c-style string, especially the trailing nul byte.

sqlite_exception(const char* msg, std::string sql, int code = -1): runtime_error(msg), code(code), sql(sql) {}
sqlite_exception(int code, std::string sql): runtime_error(sqlite3_errstr(code)), code(code), sql(sql) {}
sqlite_exception(const char* msg, STR_REF sql, int code = -1): runtime_error(msg), code(code), sql(std::move(sql)) {}
sqlite_exception(int code, STR_REF sql): runtime_error(sqlite3_errstr(code)), code(code), sql(std::move(sql)) {}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The std::move is useless ere, moving from a const reference or a string_view is just copying.

@@ -40,7 +40,7 @@ namespace sqlite {
class more_statements: public sqlite_exception { using sqlite_exception::sqlite_exception; }; // Prepared statements can only contain one statement
class invalid_utf16: public sqlite_exception { using sqlite_exception::sqlite_exception; };

static void throw_sqlite_error(const int& error_code, const std::string &sql = "") {
static void throw_sqlite_error(const int& error_code, const STR_REF &sql = "") {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is still a const&

#else
typedef const std::string& STR_REF;
typedef const std::u16string& U16STR_REF;
#endif
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The typedef should be in out namespace to avoid introducing names in the global namespace. Also they are no longer macros, so the names should no longer be uppercase.

@@ -8,7 +8,7 @@

namespace sqlite {
namespace utility {
inline std::string utf16_to_utf8(const std::u16string &input) {
inline std::string utf16_to_utf8(const U16STR_REF &input) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another const&.

@jimmyjxiao
Copy link
Contributor Author

While we're on the topic of avoiding extra copies, is SQLITE_TRANSIENT really necessary over SQLITE_STATIC?

@zauguin
Copy link
Collaborator

zauguin commented Jun 18, 2018

If we know we have a literal string we could use SQLITE_STATIC but when the user gives us a std::string(_view), we can not know how long it is valid. For example using the iterator interface or prepared statements, passed temporarys will very often be destroyed before the statement is complete.
We could document the requirements and provide optional overloads for SQLITE_STATIC but I think the reduced safety is not worth the probably rather small performance advantage. Especially because I do not hink there is a way to detect errors here without something like valgrind and in most cases the database disk write will be significantly more expensive than the string copy anyway.

Of course, I would be happy to hear about ideas how this could work.

@jimmyjxiao
Copy link
Contributor Author

Alright, Readme and tests have been updated, can we take another look at merging? Also for the SQLITE_STATIC thing, take a look here: jimmyjxiao@5d5ce07 on how I think it could be implemented

@aminroosta aminroosta merged commit 8327ea6 into SqliteModernCpp:dev Jun 19, 2018
@aminroosta
Copy link
Collaborator

aminroosta commented Jun 19, 2018

@zowpowow About SQLITE_STATIC, @5d5ce07 looks promising.
However i'm more worried about the interface, instead of having

db << "select * where name = ?" << binding_mode::static << "tom";

I'd recommend

db << "select * where name = ?" << sqlite::as_static("tom");

Please see #122 , we have a discussion on supporting blob types without vector<T>,
The idea was the the same to have << sqlite::blob(void*, size_t) overloaded.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants