AI Assistant Voicemail

AI Assistant Voicemail — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • INDECT

    INDECT

    INDECT is a research project in the area of intelligent security systems performed by several European universities since 2009 and funded by the European Union. The purpose of the project is to involve European scientists and researchers in the development of solutions to and tools for automatic threat detection through e.g. processing of CCTV camera data streams, standardization of video sequence quality for user applications, threat detection in computer networks as well as data and privacy protection. The area of research, applied methods, and techniques are described in the public deliverables which are available to the public on the project's website. Practically, all information related to the research is public. Only documents that comprise information related to financial data or information that could negatively influence the competitiveness and law enforcement capabilities of parties involved in the project are not published. This follows regulations and practices applied in EU research projects. == Application and target users == The main end-user of INDECT solutions are police forces and security services. The principle of operation of the project is detecting threats and identifying sources of threats, without monitoring and searching for particular citizens or groups of citizens. Then, the system operator (i.e. police officer) decides whether an intervention of services responsible for public security are required or not. Further investigation eventually leading to persons related to threats is performed, preserving the presumption of innocence, based on existing procedures already used by police services and prosecutors. As it can be found in the project deliverables, INDECT does not involve storage of personal data (such as names, addresses, identity document numbers, etc.). A similar, behavior-based surveillance program was SAMURAI (Suspicious and Abnormal behavior Monitoring Using a netwoRk of cAmeras & sensors for sItuation awareness enhancement). == Expected results == The main expected results of the INDECT project are: Trial of intelligent analysis of video and audio data for threat detection in urban environments Creation of tools and technology for privacy and data protection during storage and transmission of information using quantum cryptography and new methods of digital watermarking Performing computer-aided detection of threats and targeted crimes in Internet resources with privacy-protecting solutions Construction of a search engine for rapid semantic search based on watermarking of content related to child pornography and human organ trafficking Implementation of a distributed computer system that is capable of effective intelligent processing == Controversy == Some media and other sources accuse INDECT of privacy abuse, collecting personal data, and keeping information from the public. Consequently, these issues have been commented and discussed by some Members of the European Parliament. As seen in the project's documentation, INDECT does not involve mobile phone tracking or call interception. The rumors about testing INDECT during 2012 UEFA European Football Championship also turned out to be false. The mid-term review of the Seventh Framework Programme to the European Parliament strongly urges the European Commission to immediately make all documents available and to define a clear and strict mandate for the research goal, the application, and the end users of INDECT, and stresses a thorough investigation of the possible impact on fundamental rights. Nevertheless, according to Mr. Paweł Kowal, MEP, the project had the ethical review on 15 March 2011 in Brussels with the participation of ethics experts from Austria, France, Netherlands, Germany and Great Britain.

    Read more →
  • Transaction data

    Transaction data

    Transaction data or transaction information is a category of data describing transactions. Transaction data/information gather variables generally referring to reference data or master data – e.g. dates, times, time zones, currencies. Typical transactions are: Financial transactions about orders, invoices, payments; Work transactions about plans, activity records; Logistic transactions about deliveries, storage records, travel records, etc.. == Management == Recording and storing transactions is called records management. The record of the transaction is stored in a place where the retention can be guaranteed and where data is archived or removed following a retention period. Formats of recorded transactions can be digital data in databases and spreadsheets, or handwritten texts in physical documents like former bankbooks. Transaction processing systems are application software that generate transactions and manage transaction data/information, e.g. SAP and Oracle Financials. == Data warehousing == Transaction data can be summarised in a data warehouse, which helps accessibility and analysis of the data.

    Read more →
  • Kullback–Leibler Upper Confidence Bound

    Kullback–Leibler Upper Confidence Bound

    In multi-armed bandit problems, KL-UCB (for Kullback–Leibler Upper Confidence Bound) is a UCB-type algorithm that is asymptotically optimal, in the sense that its regret matches the problem-dependent Lai-Robbins lower bound. == Multi-armed bandit problem == The Multi-armed bandit problem is a sequential game where one player has to choose at each turn between K {\displaystyle K} actions (arms). Behind every arm a {\displaystyle a} there is an unknown distribution ν a {\displaystyle \nu _{a}} that lies in a set D {\displaystyle {\mathcal {D}}} known by the player (for example, D {\displaystyle {\mathcal {D}}} can be the set of Gaussian distributions or Bernoulli distributions). At each turn t {\displaystyle t} the player chooses (pulls) an arm a t {\displaystyle a_{t}} , he then gets an observation X t {\displaystyle X_{t}} of the distribution ν a t {\displaystyle \nu _{a_{t}}} . === Regret minimization === The goal is to minimize the regret at time T {\displaystyle T} that is defined as R T := ∑ a = 1 K Δ a E [ N a ( T ) ] {\displaystyle R_{T}:=\sum _{a=1}^{K}\Delta _{a}\mathbb {E} [N_{a}(T)]} where μ a := E [ ν a ] {\displaystyle \mu _{a}:=\mathbb {E} [\nu _{a}]} is the mean of arm a {\displaystyle a} μ ∗ := max a μ a {\displaystyle \mu ^{}:=\max _{a}\mu _{a}} is the highest mean Δ a := μ ∗ − μ a {\displaystyle \Delta _{a}:=\mu ^{}-\mu _{a}} N a ( t ) {\displaystyle N_{a}(t)} is the number of pulls of arm a {\displaystyle a} up to turn t {\displaystyle t} The player has to find an algorithm that chooses at each turn t {\displaystyle t} which arm to pull based on the previous actions and observations ( a s , X s ) s < t {\displaystyle (a_{s},X_{s})_{s μ } {\displaystyle {\mathcal {K}}_{inf}(\nu ,\mu ,{\mathcal {D}}):=\inf \left\{\mathrm {KL} (\nu ,{\tilde {\nu }})\ |\ {\tilde {\nu }}\in {\mathcal {D}},\ \mathbb {E} [{\tilde {\nu }}]>\mu \right\}} K L {\displaystyle \mathrm {KL} } is the Kullback–Leibler divergence ν ^ a ( t ) {\displaystyle {\hat {\nu }}_{a}(t)} is the empirical distribution of arm a {\displaystyle a} at turn t {\displaystyle t} δ t {\displaystyle \delta _{t}} is a well-chosen sequence of positive numbers, often equal to ln ⁡ t + c ln ⁡ ln ⁡ t {\displaystyle \ln t+c\ln \ln t} with c > 0 {\displaystyle c>0} . Then we choose the arm a t {\displaystyle a_{t}} with the highest index: a t := arg ⁡ max a U a ( t ) {\displaystyle a_{t}:=\arg \max _{a}U_{a}(t)} We note that the algorithm does not require knowledge of T {\displaystyle T} . === Example === In the special case of Gaussian distribution with fixed variance σ 2 {\displaystyle \sigma ^{2}} , we have: U a ( t ) = μ ^ a ( t ) + 2 σ 2 δ t N a ( t ) {\displaystyle U_{a}(t)={\hat {\mu }}_{a}(t)+{\sqrt {\frac {2\sigma ^{2}\delta _{t}}{N_{a}(t)}}}} with μ ^ a ( t ) {\displaystyle {\hat {\mu }}_{a}(t)} being the empirical mean of arm a {\displaystyle a} at turn t {\displaystyle t} . === Pseudocode === The player gets the set D for each arm i do: n[i] ← 1; nu[i] ← None; d ← ln(K) for t from 1 to K do: select arm t observe reward r n[t] ← n[t] + 1 nu[t] ← update empirical distribution for t from K+1 to T do: for each arm i do: index[i] ← compute_index(n[i], nu[i], D, d) select arm a with highest index[a] observe reward r n[a] ← n[a] + 1 nu[a] ← update empirical distribution d ← ln(t+1) == Theoretical results == In the multi-armed bandit problem we have the Lai–Robbins asymptotic lower bound on regret. The algorithm KL-UCB matches this lower bound for one-dimensional exponential families with δ t := ln ⁡ t + 3 ln ⁡ ln ⁡ t {\displaystyle \delta _{t}:=\ln t+3\ln \ln t} and for distributions bounded in [ 0 , 1 ] {\displaystyle [0,1]} with δ t := ln ⁡ t + ln ⁡ ln ⁡ t {\displaystyle \delta _{t}:=\ln t+\ln \ln t} . === Lai–Robbins lower bound === In 1985 Lai and Robbins proved an asymptotic, problem-dependent lower bound on regret. It states that for every consistent algorithm on the set D {\displaystyle {\mathcal {D}}} — that is, an algorithm for which, for every ( ν 1 , … , ν K ) ∈ D K {\displaystyle (\nu _{1},\dots ,\nu _{K})\in {\mathcal {D}}^{K}} , the regret R T {\displaystyle R_{T}} is subpolynomial (i.e. R T = o T → + ∞ ( T α ) {\displaystyle R_{T}=o_{T\to +\infty }(T^{\alpha })} for all α > 0 {\displaystyle \alpha >0} ) — we have: R T ≥ ( ∑ a : μ a < μ ∗ Δ a K inf ( ν a , μ ∗ , D ) ) ln ⁡ T + o T → + ∞ ( ln ⁡ T ) . {\displaystyle R_{T}\geq \left(\sum _{a:\mu _{a}<\mu ^{}}{\frac {\Delta _{a}}{{\mathcal {K}}_{\inf }(\nu _{a},\mu ^{},{\mathcal {D}})}}\right)\ln T+o_{T\to +\infty }(\ln T).} This bound is asymptotic (as T → + ∞ {\displaystyle T\to +\infty } ) and gives a first-order lower bound of order ln ⁡ T {\displaystyle \ln T} with the optimal constant in front of it. === Regret bound for KL-UCB === The algorithm matches the Lai–Robbins lower bound for one-dimensional exponential-family distributions and for distributions bounded in [ 0 , 1 ] {\displaystyle [0,1]} . ==== One-dimensional exponential family ==== For D {\displaystyle {\mathcal {D}}} being the set of one-dimensional exponential families, with δ t := ln ⁡ t + 3 ln ⁡ ln ⁡ t {\displaystyle \delta _{t}:=\ln t+3\ln \ln t} we have the following upper bound on the regret of KL-UCB: R T ≤ ( ∑ a : μ a < μ ∗ Δ a K inf ( ν a , μ ∗ , D ) ) ln ⁡ T + O T ( ln ⁡ T ) . {\displaystyle R_{T}\leq \left(\sum _{a:\mu _{a}<\mu ^{}}{\frac {\Delta _{a}}{{\mathcal {K}}_{\inf }(\nu _{a},\mu ^{},{\mathcal {D}})}}\right)\ln T+O_{T}({\sqrt {\ln T}}).} ==== Bounded distributions in [0,1] ==== For D = P ( [ 0 , 1 ] ) {\displaystyle {\mathcal {D}}={\mathcal {P}}([0,1])} (the set of distributions supported on [ 0 , 1 ] {\displaystyle [0,1]} ), and for δ t := ln ⁡ t + ln ⁡ ln ⁡ t {\displaystyle \delta _{t}:=\ln t+\ln \ln t} , we have the following upper bound on the regret of KL-UCB: R T ≤ ( ∑ a : μ a < μ ∗ Δ a K inf ( ν a , μ ∗ , D ) ) ln ⁡ T + O T ( ( ln ⁡ T ) 4 / 5 ln ⁡ ln ⁡ T ) . {\displaystyle R_{T}\leq \left(\sum _{a:\mu _{a}<\mu ^{}}{\frac {\Delta _{a}}{{\mathcal {K}}_{\inf }(\nu _{a},\mu ^{},{\mathcal {D}})}}\right)\ln T+O_{T}{\big (}(\ln T)^{4/5}\ln \ln T{\big )}.} === Runtime === For D = P ( [ 0 , 1 ] ) {\displaystyle {\mathcal {D}}={\mathcal {P}}([0,1])} , the runtime needed per step and for an arm k {\displaystyle k} with n {\displaystyle n} observations is O ( n ( ln ⁡ n ) 2 ) {\displaystyle {\mathcal {O}}{\big (}n(\ln n)^{2}{\big )}} . This is higher than that of other optimal algorithms, such as NPTS with O ( n ) {\displaystyle {\mathcal {O}}(n)} . MED with O ( n ln ⁡ n ) {\displaystyle {\mathcal {O}}(n\ln n)} . and IMED with O ( n ln ⁡ n ) {\displaystyle {\mathcal {O}}(n\ln n)} . The high runtime of KL-UCB is due to a two-level optimisation: for each arm and candidate mean μ {\displaystyle \mu } , the algorithm evaluates K inf ( ν ^ a ( t ) , μ , D ) {\displaystyle {\mathcal {K}}_{\inf }({\hat {\nu }}_{a}(t),\mu ,{\mathcal {D}})} and then maximises μ {\displaystyle \mu } subject to N a ( t ) K inf ( ν ^ a ( t ) , μ , D ) ≤ δ t {\displaystyle N_{a}(t)\,{\mathcal {K}}_{\inf }({\hat {\nu }}_{a}(t),\mu ,{\mathcal {D}})\leq \delta _{t}} . For distributions bounded in [ 0 , 1 ] {\displaystyle [0,1]} the inner problem has no closed form and must be solved numerically, which increases the per-step cost.

    Read more →
  • SQL programming tool

    SQL programming tool

    In the field of software, SQL programming tools provide platforms for database administrators (DBAs) and application developers to perform daily tasks efficiently and accurately. Database administrators and application developers often face constantly changing environments which they rarely completely control. Many changes result from new development projects or from modifications to existing code, which, when deployed to production, do not always produce the expected result. For organizations to better manage development projects and the teams that develop code, suppliers of SQL programming tools normally provide more than facility to the database administrator or application developer to aid in database management and in quality code-deployment practices. == Features == SQL programming tools may include the following features: === SQL editing === SQL editors allow users to edit and execute SQL statements. They may support the following features: cut, copy, paste, undo, redo, find (and replace), bookmarks block indent, print, save file, uppercase/lowercase keyword highlighting auto-completion access to frequently used files output of query result editing query-results committing and rolling-back transactions inside cut paper === Object browsing === Tools may display information about database objects relevant to developers or to database administrators. Users may: view object descriptions view object definitions (DDL) create database objects enable and disable triggers and constraints recompile valid or invalid objects query or edit tables and views Some tools also provide features to display dependencies among objects, and allow users to expand these dependent objects recursively (for example: packages may reference views, views generally reference tables, super/subtypes, and so on). === Session browsing === Database administrators and application developers can use session browsing tools to view the current activities of each user in the database. They can check the resource-usage of individual users, statistics information, locked objects and the current running SQL of each individual session. === User-security management === DBAs can create, edit, delete, disable or enable user-accounts in the database using security-management tools. DBAs can also assign roles, system privileges, object privileges, and storage-quotas to users. === Debugging === Some tools offer features for the debugging of stored procedures: step in, step over, step out, run until exception, breakpoints, view & set variables, view call stack, and so on. Users can debug any program-unit without making any modification to it, including triggers and object types. === Performance monitoring === Monitoring tools may show the database resources — usage summary, service time summary, recent activities, top sessions, session history or top SQL — in easy-to-read graphs. Database administrators can easily monitor the health of various components in the monitoring instance. Application developers may also make use of such tools to diagnose and correct application-performance problems as well as improve SQL server performance. === Test data === Test data generation tools can populate the database by realistic test data for server or client side testing purposes. Also, this kind of software can upload sample blob files to database.

    Read more →
  • Server.com

    Server.com

    Server.com is a domain name that was owned by software as a service (SaaS) company Server Corporation. They offered a suite of services from 1996 until 2007. It was the first SaaS site to offer a variety of services and the first to use the term WebApp to describe its services. It was selected as an Incredibly Useful Site by Yahoo! Internet Life magazine. net magazine listed Server.com among the 100 most influential websites of all time. Server.com launched in 1996 offering the first online personal information manager. In 1997, they rolled out the first threaded message board service; the first web based mailing list manager; one of the first online calendar services; and one of the first online form builders. In 2000, Server.com partnered with NBCi and became server.snap.com until 2001. In 2001, Server.com was serving 100 million monthly pageviews. Media Life declared it one of the 20 biggest ad domains on the Web. In 2002, Server.com developed one of the first web-based RSS aggregators. In 2007, all services were moved to YourWebApps.com. The domain name Server.com was sold in 2009 for $770,000.

    Read more →
  • Single-source publishing

    Single-source publishing

    Single-source publishing, also known as single-sourcing publishing, is a content management method which allows the same source content to be used across different forms of media and more than one time. The labor-intensive and expensive work of editing need only be carried out once, on only one document; that source document (the single source of truth) can then be stored in one place and reused. This reduces the potential for error, as corrections are only made one time in the source document. The benefits of single-source publishing primarily relate to the editor rather than the user. The user benefits from the consistency that single-sourcing brings to terminology and information. This assumes the content manager has applied an organized conceptualization to the underlying content (A poor conceptualization can make single-source publishing less useful). Single-source publishing is sometimes used synonymously with multi-channel publishing though whether or not the two terms are synonymous is a matter of discussion. == Definition == While there is a general definition of single-source publishing, there is no single official delineation between single-source publishing and multi-channel publishing, nor are there any official governing bodies to provide such a delineation. Single-source publishing is most often understood as the creation of one source document in an authoring tool and converting that document into different file formats or human languages (or both) multiple times with minimal effort. Multi-channel publishing can either be seen as synonymous with single-source publishing, or similar in that there is one source document but the process itself results in more than a mere reproduction of that source. == History == The origins of single-source publishing lie, indirectly, with the release of Windows 3.0 in 1990. With the eclipsing of MS-DOS by graphical user interfaces, help files went from being unreadable text along the bottom of the screen to hypertext systems such as WinHelp. On-screen help interfaces allowed software companies to cease the printing of large, expensive help manuals with their products, reducing costs for both producer and consumer. This system raised opportunities as well, and many developers fundamentally changed the way they thought about publishing. Writers of software documentation did not simply move from being writers of traditional bound books to writers of electronic publishing, but rather they became authors of central documents which could be reused multiple times across multiple formats. The first single-source publishing project was started in 1993 by Cornelia Hofmann at Schneider Electric in Seligenstadt, using software based on Interleaf to automatically create paper documentation in multiple languages based on a single original source file. XML, developed during the mid- to late-1990s, was also significant to the development of single-source publishing as a method. XML, a markup language, allows developers to separate their documentation into two layers: a shell-like layer based on presentation and a core-like layer based on the actual written content. This method allows developers to write the content only one time while switching it in and out of multiple different formats and delivery methods. In the mid-1990s, several firms began creating and using single-source content for technical documentation (Boeing Helicopter, Sikorsky Aviation and Pratt & Whitney Canada) and user manuals (Ford owners manuals) based on tagged SGML and XML content generated using the Arbortext Epic editor with add-on functions developed by a contractor. The concept behind this usage was that complex, hierarchical content that did not lend itself to discrete componentization could be used across a variety of requirements by tagging the differences within a single document using the capabilities built into SGML and XML. Ford, for example, was able to tag its single owner's manual files so that 12 model years could be generated via a resolution script running on the single completed file. Pratt & Whitney, likewise, was able to tag up to 20 subsets of its jet engine manuals in single-source files, calling out the desired version at publication time. World Book Encyclopedia also used the concept to tag its articles for American and British versions of English. Starting from the early 2000s, single-source publishing was used with an increasing frequency in the field of technical translation. It is still regarded as the most efficient method of publishing the same material in different languages. Once a printed manual was translated, for example, the online help for the software program which the manual accompanies could be automatically generated using the method. Metadata could be created for an entire manual and individual pages or files could then be translated from that metadata with only one step, removing the need to recreate information or even database structures. Although single-source publishing is now decades old, its importance has increased urgently as of the 2010s. As consumption of information products rises and the number of target audiences expands, so does the work of developers and content creators. Within the industry of software and its documentation, there is a perception that the choice is to embrace single-source publishing or render one's operations obsolete. == Criticism == Editors using single-source publishing have been criticized for below-standard work quality, leading some critics to describe single-source publishing as the "conveyor belt assembly" of content creation. While heavily used in technical translation, there are risks of error in regard to indexing. While two words might be synonyms in English, they may not be synonyms in another language. In a document produced via single-sourcing, the index will be translated automatically and the two words will be rendered as synonyms. This is because they are synonyms in the source language, while in the target language they are not.

    Read more →
  • UI data binding

    UI data binding

    UI data binding is a software design pattern to simplify development of GUI applications. UI data binding binds UI elements to an application domain model. Most frameworks employ the Observer pattern as the underlying binding mechanism. To work efficiently, UI data binding has to address input validation and data type mapping. A bound control is a widget whose value is tied or bound to a field in a recordset (e.g., a column in a row of a table). Changes made to data within the control are automatically saved to the database when the control's exit event triggers. == Example == == Data binding frameworks and tools == === Delphi === DSharp third-party data binding tool OpenWire Visual Live Binding - third-party visual data binding tool === Java === JFace Data Binding JavaFX Property === .NET === Windows Forms data binding overview WPF data binding overview Avalonia Unity 3D data binding framework (available in modifications for NGUI, iGUI and EZGUI libraries) === JavaScript === Angular AngularJS Backbone.js Ember.js Datum.js knockout.js Meteor, via its Blaze live update engine OpenUI5 React Vue.js

    Read more →
  • Distributed transaction

    Distributed transaction

    A distributed transaction operates within a distributed environment, typically involving multiple nodes across a network depending on the location of the data. A key aspect of distributed transactions is atomicity, which ensures that the transaction is completed in its entirety or not executed at all. It's essential to note that distributed transactions are not limited to databases. The Open Group, a vendor consortium, proposed the X/Open Distributed Transaction Processing Model (X/Open XA), which became a de facto standard for the behavior of transaction model components. Databases are common transactional resources and, often, transactions span a couple of such databases. In this case, a distributed transaction can be seen as a database transaction that must be synchronized (or provide ACID properties) among multiple participating databases which are distributed among different physical locations. The isolation property (the I of ACID) poses a special challenge for multi database transactions, since the (global) serializability property could be violated, even if each database provides it (see also global serializability). In practice most commercial database systems use strong strict two-phase locking (SS2PL) for concurrency control, which ensures global serializability, if all the participating databases employ it. A common algorithm for ensuring correct completion of a distributed transaction is the two-phase commit (2PC). This algorithm is usually applied for updates able to commit in a short period of time, ranging from couple of milliseconds to couple of minutes. There are also long-lived distributed transactions, for example a transaction to book a trip, which consists of booking a flight, a rental car and a hotel. Since booking the flight might take up to a day to get a confirmation, two-phase commit is not applicable here, it will lock the resources for this long. In this case more sophisticated techniques that involve multiple undo levels are used. The way you can undo the hotel booking by calling a desk and cancelling the reservation, a system can be designed to undo certain operations (unless they are irreversibly finished). In practice, long-lived distributed transactions are implemented in systems based on web services. Usually these transactions utilize principles of compensating transactions, Optimism and Isolation Without Locking. The X/Open standard does not cover long-lived distributed transactions. Several technologies, including Jakarta Enterprise Beans and Microsoft Transaction Server fully support distributed transaction standards. == Synchronization == In event-driven architectures, distributed transactions can be synchronized through using request–response paradigm and it can be implemented in two ways: Creating two separate queues: one for requests and the other for replies. The event producer must wait until it receives the response. Creating one dedicated ephemeral queue for each request.

    Read more →
  • Lossless join decomposition

    Lossless join decomposition

    In database design, a lossless join decomposition is a decomposition of a relation r {\displaystyle r} into relations r 1 , r 2 {\displaystyle r_{1},r_{2}} such that a natural join of the two smaller relations yields back the original relation. This is central in removing redundancy safely from databases while preserving the original data. Lossless join can also be called non-additive. == Definition == A relation r {\displaystyle r} on schema R {\displaystyle R} decomposes losslessly onto schemas R 1 {\displaystyle R_{1}} and R 2 {\displaystyle R_{2}} if π R 1 ( r ) ⋈ π R 2 ( r ) = r {\displaystyle \pi _{R_{1}}(r)\bowtie \pi _{R_{2}}(r)=r} , that is r {\displaystyle r} is the natural join of its projections onto the smaller schemas. A pair ( R 1 , R 2 ) {\displaystyle (R_{1},R_{2})} is a lossless-join decomposition of R {\displaystyle R} or said to have a lossless join with respect to a set of functional dependencies F {\displaystyle F} if any relation r ( R ) {\displaystyle r(R)} that satisfies F {\displaystyle F} decomposes losslessly onto R 1 {\displaystyle R_{1}} and R 2 {\displaystyle R_{2}} . Decompositions into more than two schemas can be defined in the same way. == Criteria == A decomposition R = R 1 ∪ R 2 {\displaystyle R=R_{1}\cup R_{2}} has a lossless join with respect to F {\displaystyle F} if and only if the closure of R 1 ∩ R 2 {\displaystyle R_{1}\cap R_{2}} includes R 1 ∖ R 2 {\displaystyle R_{1}\setminus R_{2}} or R 2 ∖ R 1 {\displaystyle R_{2}\setminus R_{1}} . In other words, one of the following must hold: ( R 1 ∩ R 2 ) → ( R 1 ∖ R 2 ) ∈ F + {\displaystyle (R_{1}\cap R_{2})\to (R_{1}\setminus R_{2})\in F^{+}} ( R 1 ∩ R 2 ) → ( R 2 ∖ R 1 ) ∈ F + {\displaystyle (R_{1}\cap R_{2})\to (R_{2}\setminus R_{1})\in F^{+}} === Criteria for multiple sub-schemas === Multiple sub-schemas R 1 , R 2 , . . . , R n {\displaystyle R_{1},R_{2},...,R_{n}} have a lossless join if there is some way in which we can repeatedly perform lossless joins until all the schemas have been joined into a single schema. Once we have a new sub-schema made from a lossless join, we are not allowed to use any of its isolated sub-schema to join with any of the other schemas. For example, if we can do a lossless join on a pair of schemas R i , R j {\displaystyle R_{i},R_{j}} to form a new schema R i , j {\displaystyle R_{i,j}} , we use this new schema (rather than R i {\displaystyle R_{i}} or R j {\displaystyle R_{j}} ) to form a lossless join with another schema R k {\displaystyle R_{k}} (which may already be joined (e.g., R k , l {\displaystyle R_{k,l}} )). == Example == Let R = { A , B , C , D } {\displaystyle R=\{A,B,C,D\}} be the relation schema, with attributes A, B, C and D. Let F = { A → B C } {\displaystyle F=\{A\rightarrow BC\}} be the set of functional dependencies. Decomposition into R 1 = { A , B , C } {\displaystyle R_{1}=\{A,B,C\}} and R 2 = { A , D } {\displaystyle R_{2}=\{A,D\}} is lossless under F because R 1 ∩ R 2 = A {\displaystyle R_{1}\cap R_{2}=A} and we have a functional dependency A → B C {\displaystyle A\rightarrow BC} . In other words, we have proven that ( R 1 ∩ R 2 → R 1 ∖ R 2 ) ∈ F + {\displaystyle (R_{1}\cap R_{2}\rightarrow R_{1}\setminus R_{2})\in F^{+}} .

    Read more →
  • Mobile content management system

    Mobile content management system

    A mobile content management system (MCMs) is a type of content management system (CMS) capable of storing and delivering content and services to mobile devices, such as mobile phones, smart phones, and PDAs. Mobile content management systems may be discrete systems, or may exist as features, modules or add-ons of larger content management systems capable of multi-channel content delivery. Mobile content delivery has unique, specific constraints including widely variable device capacities, small screen size, limitations on wireless bandwidth, sometimes small storage capacity, and (for some devices) comparatively weak device processors. Demand for mobile content management increased as mobile devices became increasingly ubiquitous and sophisticated. MCMS technology initially focused on the business to consumer (B2C) mobile market place with ringtones, games, text-messaging, news, and other related content. Since, mobile content management systems have also taken root in business-to-business (B2B) and business-to-employee (B2E) situations, allowing companies to provide more timely information and functionality to business partners and mobile workforces in an increasingly efficient manner. A 2008 estimate put global revenue for mobile content management at US$8 billion. == Key features == === Multi-channel content delivery === Multi-channel content delivery capabilities allow users not to manage a central content repository while simultaneously delivering that content to mobile devices such as mobile phones, smartphones, tablets and other mobile devices. Content can be stored in a raw format (such as Microsoft Word, Excel, PowerPoint, PDF, Text, HTML etc.) to which device-specific presentation styles can be applied. === Content access control === Access control includes authorization, authentication, access approval to each content. In many cases the access control also includes download control, wipe-out for specific user, time specific access. For the authentication, MCM shall have basic authentication which has user ID and password. For higher security many MCM supports IP authentication and mobile device authentication. === Specialized templating system === While traditional web content management systems handle templates for only a handful of web browsers, mobile CMS templates must be adapted to the very wide range of target devices with different capacities and limitations. There are two approaches to adapting templates: multi-client and multi-site. The multi-client approach makes it possible to see all versions of a site at the same domain (e.g. sitename.com), and templates are presented based on the device client used for viewing. The multi-site approach displays the mobile site on a targeted sub-domain (e.g. mobile.sitename.com). === Location-based content delivery === Location-based content delivery provides targeted content, such as information, advertisements, maps, directions, and news, to mobile devices based on current physical location. Currently, GPS (global positioning system) navigation systems offer the most popular location-based services. Navigation systems are specialized systems, but incorporating mobile phone functionality makes greater exploitation of location-aware content delivery possible.

    Read more →
  • Seismological Facility for the Advancement of Geoscience

    Seismological Facility for the Advancement of Geoscience

    The U.S. National Science Foundation's Seismological Facility for the Advancement of Geoscience (NSF SAGE) is a distributed, multi-user national facility that provides support for state of-the-art seismic research. It is operated by EarthScope Consortium. Its previous operator was the Incorporated Research Institutions for Seismology (IRIS), until its merger with UNAVCO to become EarthScope Consortium. NSF SAGE is one of the two premier geophysical facilities in support of geoscience and geoscience education of the National Science Foundation. The other premiere geophysical facility is NSF GAGE, the Geodetic Facility for the Advancement of Geoscience. The services of the facility include support for the Global Seismographic Network (GSN), Data Services, and instrument support via the EarthScope Primary Instrument Center (EPIC), including magnetotelluric (MT) geophysical research. == Global Seismographic Network (GSN) == NSF SAGE manages 40 stations of the 152-station Global Seismographic Network (GSN) for basic global seismicity and Earth structure research. The GSN also enables earthquake hazard mission-related data operations such as: Earthquake location and characterization Tsunami warning Nuclear explosion monitoring == Data Services == SAGE Data Services (DS) is the largest facility for the archiving, curation, and distribution of seismological and other geophysical data in the world. == EarthScope Primary Instrument Center (EPIC) == The EPIC facility maintains the largest open access, shared-use pool of portable seismic sensors in the world. It is located on the campus of New Mexico Tech. == MT == NSF SAGE provides instruments for magnetotelluric (MT) or electromagnetic geophysical research for the recording of our planet's ambient electric and magnetic fields, which allow for the characterization of the conductivity of the area consisting of the shallow crust to upper mantle. This helps with analysis of results obtained from seismic imaging methodologies. The NSF SAGE facility is: Developing open source MT data formatting and processing software. Providing access to proprietary software products.

    Read more →
  • Magic Quadrant

    Magic Quadrant

    Magic Quadrant (MQ) is a series of market research reports published by research and advisory firm Gartner that rely on proprietary qualitative data analysis methods to demonstrate market trends, such as direction, maturity, and participants. Their analyses are conducted for several specific technology industries and are updated every 1–2 years: once an updated report has been published, its predecessor is "retired". == Rating == Gartner rates vendors upon two criteria: completeness of vision and ability to execute. Completeness of vision – Reflects the vendor's innovation, and whether the vendor drives or follows the market. Ability to execute – Summarizes factors such as the vendor's financial viability, market responsiveness, product development, sales channels and customer base. The two component scores lead to a vendor position in one of four quadrants: === Leaders === Vendors in the "Leaders" quadrant have the highest composite scores for their completeness of vision and ability to execute. A vendor in the Leaders quadrant has the market share, credibility, and marketing & sales capabilities needed to drive the acceptance of new technologies. These vendors demonstrate a clear understanding of market needs, they are innovators and thought leaders, and they have well-articulated plans that customers and prospects can use when designing their infrastructures and strategies. In addition, they have a presence in the five major geographical regions, consistent financial performance, and broad platform support. === Challengers === Vendors in the "Challengers" quadrant have high scores mainly for their ability to execute. They both participate in the market and execute well enough to be a serious threat to vendors in the "Leaders" quadrant. They have strong products, as well as sufficiently credible market position and resources to sustain continued growth. Financial viability is not an issue for vendors in the "Challengers" quadrant, but they lack the size and influence of vendors in the "Leaders" quadrant due to their relative lack of vision. === Visionaries === Vendors in the "Visionaries" quadrant have high scores mainly for their completeness of vision. They deliver innovative products that address operationally or financially important end-user problems at a broad scale, but have not yet demonstrated the ability to capture market share or maintain sustainable levels of profitability. Visionary vendors are frequently privately held companies and acquisition targets for larger, established companies. The likelihood of acquisition often reduces the risks associated with installing their systems. === Niche Players === Vendors in the "Niche Players" quadrant have relatively low scores for both their ability to execute and their completeness of vision. They are often narrowly focused on specific market or vertical segments. This quadrant often also includes vendors that are adapting their existing products to enter the market under consideration, or larger vendors having difficulty developing and executing on their vision. == Gartner Critical Capabilities == Gartner Critical Capabilities complement Magic Quadrant analysis to offer deeper insight into the products and services offered by multiple vendors by a comparative analysis that scores competing products or services against a set of critical differentiators identified by Gartner. Gartner has periodically ended Magic Quadrant listings for IT Service Management, Web Content Management, and other industries as those markets have fully matured or other factors rendered the analytic framework inapplicable. == Criticism == The Magic Quadrant, and analysts in general, skew the market: according to research, by applying their methodologies to describe a market, they change that marketplace to fit their tools. Another criticism is that open source vendors are not considered sufficiently by analysts like Gartner, as has been published in an online discussion between a VP from Talend and a German Research VP from Gartner. On May 29, 2009 (2009-05-29), software vendor ZL Technologies filed a federal lawsuit against Gartner that challenged the "legitimacy" of Gartner's Magic Quadrant rating system. Gartner filed a motion to dismiss by claiming First Amendment protection since it contends that its MQ reports contain "pure opinion", which legally means opinions that are not based on fact. The court threw out the ZL case because it lacked a specific complaint. The decision was upheld on appeal.

    Read more →
  • BulSemCor

    BulSemCor

    The Bulgarian Sense-annotated Corpus (BulSemCor) (Bulgarian: Български семантично анотиран корпус (БулСемКор)) is a structured corpus of Bulgarian texts in which each lexical item is assigned a sense tag. BulSemCor was created by the Department of Computational Linguistics at the Institute for Bulgarian Language of the Bulgarian Academy of Sciences. == Structure == BulSemCor was created as part of a nationally funded project titled "BulNet – A lexico-semantic network for the Bulgarian Language" (2005–2010). It follows the general methodology of SemCor combined with some specific principles. The corpus for annotation consists of 101,791 tokens covering an excerpt from the Bulgarian "Brown" Corpus modelled on the Brown Corpus.Francis Kucera An important feature of BulSemCor is that the samples are selected using heuristics that provide optimal coverage of ambiguous lexis. BulSemCor is manually sense-annotated according to the Bulgarian WordNet. Its size is comparable to that of other contemporary semantically annotated corpora or pool of acceptable linguistic components. The semantic annotation consists in associating each lexical item in the corpus with exactly one synonym set (synset) in the Bulgarian WordNet that best describes its sense in the particular context. The selection of the best match among the suggested candidates is based on a set of procedures, such as the other synset members, the synset gloss (explanatory definition) and the position of a given candidate in the WordNet structure. == Scale == The number of annotated tokens is 99,480 (the difference in the number of tokens compared to the initial corpus is due to the fact that some of them are not linguistic items). The simple word count is 86,842 and multiword expressions (MWE) are 5,797 (12,638 tokens). == Specific features == All words in BulSemCor are assigned a sense, while according to established practice only simple content words or content word classes (typically nouns and verbs) are annotated. Since 2000 the development of language resources, has broadened to include annotation of function words and multiword expressions covering particular senses or types of words and expressions. In this respect, BulSemCor's annotation is more exhaustive and hence provides greater opportunities for linguistic observations and non-linear programming (NLP) applications. Annotated items inherit the linguistic information associated with the corresponding synset, which along with morphological and semantic tags may include annotation on one or more of the following additional levels: Partial information about the syntactic structure of MWE types – particularly, information about syntactic heads and their dependents; Information about the category of the named entities – names, locations, organisations, dates, numbers, etc.; Information about the taxonomic category of adverbs, such as time, place, manner, degree, quantity, etc.; Information about the type of the syntactic relationships – coordination or subordination – expressed by conjunctions; Information about the original part-of-speech of substantivised words (non-nouns that act as nouns in a particular context); Stylistic/register, grammatical and other information about synsets or individual synset members;

    Read more →
  • Ontology alignment

    Ontology alignment

    Ontology alignment, or ontology matching, is the process of determining correspondences between concepts in ontologies. A set of correspondences is also called an alignment. The phrase takes on a slightly different meaning, in computer science, cognitive science or philosophy. == Computer science == For computer scientists, concepts are expressed as labels for data. Historically, the need for ontology alignment arose out of the need to integrate heterogeneous databases, ones developed independently and thus each having their own data vocabulary. In the Semantic Web context involving many actors providing their own ontologies, ontology matching has taken a critical place for helping heterogeneous resources to interoperate. Ontology alignment tools find classes of data that are semantically equivalent, for example, "truck" and "lorry". The classes are not necessarily logically identical. According to Euzenat and Shvaiko (2007), there are three major dimensions for similarity: syntactic, external, and semantic. Coincidentally, they roughly correspond to the dimensions identified by Cognitive Scientists below. A number of tools and frameworks have been developed for aligning ontologies, some with inspiration from Cognitive Science and some independently. Ontology alignment tools have generally been developed to operate on database schemas, XML schemas, taxonomies, formal languages, entity-relationship models, dictionaries, and other label frameworks. They are usually converted to a graph representation before being matched. Since the emergence of the Semantic Web, such graphs can be represented in the Resource Description Framework line of languages by triples of the form , as illustrated in the Notation 3 syntax. In this context, aligning ontologies is sometimes referred to as "ontology matching". The problem of Ontology Alignment has been tackled recently by trying to compute matching first and mapping (based on the matching) in an automatic fashion. Systems like DSSim, X-SOM or COMA++ obtained at the moment very high precision and recall. The Ontology Alignment Evaluation Initiative aims to evaluate, compare and improve the different approaches. === Formal definition === Given two ontologies i = ⟨ C i , R i , I i , T i , V i ⟩ {\displaystyle i=\langle C_{i},R_{i},I_{i},T_{i},V_{i}\rangle } and j = ⟨ C j , R j , I j , T j , V j ⟩ {\displaystyle j=\langle C_{j},R_{j},I_{j},T_{j},V_{j}\rangle } where C {\displaystyle C} is the set of classes, R {\displaystyle R} is the set of relations, I {\displaystyle I} is the set of individuals, T {\displaystyle T} is the set of data types, and V {\displaystyle V} is the set of values, we can define different types of (inter-ontology) relationships. Such relationships will be called, all together, alignments and can be categorized among different dimensions: similarity vs logic: this is the difference between matchings (predicating about the similarity of ontology terms), and mappings (logical axioms, typically expressing logical equivalence or inclusion among ontology terms) atomic vs complex: whether the alignments we considered are one-to-one, or can involve more terms in a query-like formulation (e.g., LAV/GAV mapping) homogeneous vs heterogeneous: do the alignments predicate on terms of the same type (e.g., classes are related only to classes, individuals to individuals, etc.) or we allow heterogeneity in the relationship? type of alignment: the semantics associated to an alignment. It can be subsumption, equivalence, disjointness, part-of or any user-specified relationship. Subsumption, atomic, homogeneous alignments are the building blocks to obtain richer alignments, and have a well defined semantics in every Description Logic. Let's now introduce more formally ontology matching and mapping. An atomic homogeneous matching is an alignment that carries a similarity degree s ∈ [ 0 , 1 ] {\displaystyle s\in [0,1]} , describing the similarity of two terms of the input ontologies i {\displaystyle i} and j {\displaystyle j} . Matching can be either computed, by means of heuristic algorithms, or inferred from other matchings. Formally we can say that, a matching is a quadruple m = ⟨ i d , t i , t j , s ⟩ {\displaystyle m=\langle id,t_{i},t_{j},s\rangle } , where t i {\displaystyle t_{i}} and t j {\displaystyle t_{j}} are homogeneous ontology terms, s {\displaystyle s} is the similarity degree of m {\displaystyle m} . A (subsumption, homogeneous, atomic) mapping is defined as a pair μ = ⟨ t i , t j ⟩ {\displaystyle \mu =\langle t_{i},t_{j}\rangle } , where t i {\displaystyle t_{i}} and t j {\displaystyle t_{j}} are homogeneous ontology terms. == Cognitive science == For cognitive scientists interested in ontology alignment, the "concepts" are nodes in a semantic network that reside in brains as "conceptual systems." The focal question is: if everyone has unique experiences and thus different semantic networks, then how can we ever understand each other? This question has been addressed by a model called ABSURDIST (Aligning Between Systems Using Relations Derived Inside Systems for Translation). Three major dimensions have been identified for similarity as equations for "internal similarity, external similarity, and mutual inhibition." == Ontology alignment methods == Two sub research fields have emerged in ontology mapping, namely monolingual ontology mapping and cross-lingual ontology mapping. The former refers to the mapping of ontologies in the same natural language, whereas the latter refers to "the process of establishing relationships among ontological resources from two or more independent ontologies where each ontology is labelled in a different natural language". Existing matching methods in monolingual ontology mapping are discussed in Euzenat and Shvaiko (2007). Approaches to cross-lingual ontology mapping are presented in Fu et al. (2011).

    Read more →
  • E-Science librarianship

    E-Science librarianship

    E-Science librarianship refers to a role for librarians in e-Science. == Early scholars == Early references to e-Science and librarianship involve information studies scholars researching cyberinfrastructure and emerging networked information and knowledge communities. Notably Christine Borgman, Professor and Presidential Chair in Information Studies at the University of California, Los Angeles (UCLA) was a key player in bringing e-Science, and the idea of networked knowledge communities, to the attention of the library profession. In 2004, as a visiting fellow at the Oxford Internet Institute, she conducted research and lectured publicly on e-Science, Digital Libraries, and Knowledge Communities. In 2007 Anna K. Gold, formerly of MIT and Cal Poly, San Luis Obispo, authored a series of articles in D-Lib Magazine that opened the door for academic libraries to begin exploring roles, skills, and strategies for engaging in e-Science: Cyberinfrastructure, Data, and Libraries, Part 1: A Cyberinfrastructure Primer for Librarians and Cyberinfrastructure, Data, and Libraries, Part 2: Libraries and the Data Challenge: Roles and Actions for Libraries. == Academic research and health sciences libraries == In 2007, the Association of Research Libraries (ARL) e-Science task force issued its report on e-Science and librarianship. The ARL's report encouraged its member libraries to position themselves to engage with researchers involved in e-Science (eScience) by cultivating new research support strategies and developing their digital scholarship infrastructure. E-Science has multiple attributes; Tony and Jessie Hey framed e-Science for the library community by characterizing it as a research methodology: "e-Science is not a new scientific discipline in its own right: e-Science is shorthand for the set of tools and technologies required to support collaborative, networked science". In addition to academic libraries' interests in providing support for their researchers engaging in e-Science, the health sciences library community also emerged as a major proponent for creating librarian positions for supporting the information needs of large-scale, networked, research collaborations on their campuses. Neil Rambo, current director of NYU's Health Sciences Library and former director of University of Washington Health Sciences Library, was the first to use the term in the Journal of the Medical Library Association, in his 2009 editorial e-Science and the Biomedical Library. Rambo's definition of e-Science highlighted the potential e-Science held for creating data as a research product: "E-science is a new research methodology, fueled by networked capabilities and the practical possibility of gathering and storing vast amounts of data." In response to this article the University of Massachusetts Medical School Lamar Soutter Library and National Network of Libraries of Medicine, New England Region encouraged health sciences libraries to cooperate to identify skills and develop a program for training e-Science Librarians. Then, in 2013, Shannon Bohle, an archivist who was employed in the library at Cold Spring Harbor Laboratory, an NCI-designated basic cancer research facility, used experience gained there and previous papers and presentations about preserving scientific archival materials to expand the traditional definition of e-Science by including the terms, principles, and practices used in archival science. These included in the definition the "long-term storage and accessibility of all materials generated through the scientific process," as well as examples of material types traditionally preserved in archives, like "electronic/digitized laboratory notebooks, raw and fitted data sets, manuscript production and draft versions, pre-prints," as well as library materials ("print and/or electronic publications"). == Roles == Many areas of science are about to be transformed by the availability of vast amounts of new scientific data that can potentially provide insights at a level of detail never before envisaged. However, this new data dominant era brings new challenges for the scientists and they will need the skills and technologies both of computer scientists and of the library community to manage, search and curate these new data resources. Libraries will not be immune from change in this new world of research. Karen Williams identifies roles in the following areas for librarians in the developing world of e-Science. Campus Engagement Content/Collection Development and Management Teaching and Learning Scholarly Communication E-Scholarship and Digital Tools Reference/Help Services Outreach Fund Raising Exhibit and Event Planning Leadership == Challenges for research libraries == E-science tends toward inter- and multidisciplinary approaches that depend on computation and computer science. Research libraries have traditionally been discipline focused and, although increasingly technologically sophisticated, do not have systems of the scale or complexity of the e-science environment. E-science is data intensive, but research libraries have not typically been responsible for scientific data. E-science is frequently conducted in a team context, often distributed across multiple institutions and on a global scale. The primary constituency of libraries generally comprises those affiliated with the local institution. Licenses for electronic content are typically restricted to a particular institutional community, and the infrastructure to move institutional licenses into a multi-institutional environment is not well developed. E-science challenges all these traditional paradigms of research library organization and services. == Skills == Garritano & Carlson were among the first to outline a skill set for librarians seeking to support the data needs of e-Science; they identified five skill categories librarians new to this area should expect to adapt or develop when participating on such projects: Library and information science expertise Subject expertise Partnerships and outreach (both internal and external) Participating in sponsored research Balancing workload An example of librarians reconfiguring traditional librarian skills to meet the needs of researchers engaging in e-Science is Witt & Carlson's adaptation of the traditional reference interview into a "data interview" in order to provide effective data management and e-Science services. This interview consists of ten practical queries necessary for understanding the provenance and expectations for the preservation of datasets typical of e-Science that also help illustrate some of the educational tools and skills needed by a librarian new to e-Science. "What is the story of the data? What form and format are the data in? What is the expected lifespan of the dataset? How could the data be used, reused, and repurposed? How large is the dataset, and what is its rate of growth? Who are the potential audiences for the data? Who owns the data? Does the dataset include any sensitive information? What publications or discoveries have resulted from the data? How should the data be made accessible?" == Resources == In 2009 the Lamar Soutter Library at the University of Massachusetts Medical School (UMMS) and the National Network of Libraries of Medicine, New England Region (NN/LM NER) funded an e-Science program for building the skills highlighted above for librarians. Elaine Russo Martin, Director of Library Services at the Lamar Soutter Library and Director of the NN/LM NER developed this comprehensive e-Science program to build librarians' subject expertise in the sciences, developing their data management skills, and their familiarity with cyberinfrastructure and e-Science. Three major products of this program are the e-Science web portal for librarians, the E-Science Symposium, and the New England Collaborative Data Management Curriculum (NECDMC). This portal includes educational resources for specific tools and subject/discipline tutorials and modules to assist librarians new to e-Science. UMMS and NN/LM NER also publish an open access journal called the Journal of eScience Librarianship.

    Read more →